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Chapter  1 


INTRODUCTION 


1.1  General 

This  technical  note  describes  a  computer  simulation  model  (WEAS1M)  that 
genrates  time  series  of  synthetic  observations  of  ceiling  and  visibility  at 
selected  locations.  WEASIM  was  developed  by  the  United  States  Air  Force  Environ¬ 
mental  Technical  Applications  Center  (USAFETAC)  in  response  to  a  request  by  Head¬ 
quarters  United  States  Air  Force,  Studies  and  Analyses  (HQ  USAF/SA)  for  such  a 
model.  HQ  USAF/SA  desired  this  weather  simulation  model  for  inclusion  in  several 
theater  level  simulations  of  military  operations.  The  purpose  of  this  technical 
note  is  to  provide  nonprogramming  analysts  using  WEASIM  with  tne  details  of  the 
algorithms  used  in  the  model  and  the  techniques  employed  for  model  verification 
and  validation. 

1.2  Brief  Model  Description 

WEASIM  uses  a  first  order  Markov  process  called  the  Ornstein-Uhlenbeck  (0-U) 
equation  and  a  sawtooth  wave  submodel  to  generate  the  synthetic  observations. 
The  model  preserves  the  unconditional  probabilities  of  occurrence  of  ceiling  and 
visibility  as  well  as  maintains  the  desired  temporal,  spatial,  and  cross  variable 
correlations.  The  Ceiling/Visibility  Simulation  Model  is  tuned  to  a  particular 
geographic  area  by  inputing  modeling  coefficients  and  correlation  parameters 
specifically  determined  from  observed  weather  data  for  that  area. 

1 . 3  Computer  Requirements  of  the  Model 

The  Ceiling/visibility  Simulation  Model  consists  of  11  FORTRAN  subprograms 
totaling  1630  lines  of  source  code  with  extensive  comments  (218  executable  state¬ 
ments).  Procedure  plus  data  for  the  11  subprograms  occupy  42K  bytes  of  core 
storage  on  a  32-bit  general  purpose  computer.  The  size  estimate  is  based  on  the 
model  configured  to  handle  a  maximum  of  200  locations  and  assumes  that  a  data 
base  of  modeling  coefficients,  correlation  parameters,  and  trigonometric  func¬ 
tions  of  latitude  and  longitude  have  been  read  from  tape  or  disk  files  into 
labelled  COMMON  blocks.  No  allowance  was  made  to  estimate  core  storage  for  an 
input  file  control  block  and  buffer.  As  shown  in  figure  1-1,  WEASIM  is  designed 
to  operate  as  a  miniature  subservient  simulation  within  the  user's  overall  simu¬ 
lation  model. 

Run  time  is  approximately  0.001  Central  Processing  Unit  (CPU)  seconds  per 
observation  pec  location  on  an  IBM  3033  computer  running  under  0S/VS2  MVS.  This 
estimate  is  based  on  the  time  required  to  produce  the  synthetic  observations  and 
store  them  in  a  labelled  COMMON  block.  It  does  not  include  I/O  time. 
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Figure  l-l .  WEASIM  as  a  Server-model  Within  the  User's 
Simulation  Model. 
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FUNCTIONAL  DESCRIPTION  OF  THE  MODEL 


2.1  Overview 


In  recent  years  weapon  system  effectiveness  studies,  strategy  and  doctrine 
development  simulations,  and  combat  gaming  models  have  become  much  more  sophis¬ 
ticated.  Advances  in  computer  storage  capability  and  CPU  efficiency  allow  newer 
application  models  to  describe  military  operatic  .?  in  more  detail  than  ever 
before.  Considering  weather  effects  on  military  operations  is  just  one  aspect  of 
this  general  improvement  in  modeling  capabilities . 

2.1.1  Historical  Background  of  the  Model  Development.  HQ  USAF/~n  has  a  continu¬ 
ing  requirement  for  weather  information  as  input  to  their  application  models  and 
studies.  Since  approximately  1978,  this  need  for  weather  data  has  been  partially 
met  by  using  historical  weather  data  (some  of  the  problems  involved  with  the  use 
of  historical  weather  data  will  be  discussed  in  paragraph  2.1.3  of  this  technical 
note).  In  1981  the  first  attempt  at  providing  synthetic  weather  data  generated 
from  a  simulation  model  was  made.  This  first  generation  environmental  simulation 
model  was  developed  by  HQ  USAF/SAGW.  It  provided  a  quick  efficient  method  for 
supplying  correlated  weather  observations  at  two  points  to  parent  simulations. 
The  main  weakness  of  this  model  was  that  it  could  not  consider  more  than  two 
locations  and  still  preserve  spatial  and  temporal  consistency  with  respect  to  all 
points.  In  the  early  1980 's,  the  increasing  sophistication  of  the  effectiveness 
and  evaluation  studies  being  conducted  made  it  apparent  that  the  existing  methods 
could  no  longer  meet  analysts'  needs. 

In  September  1981,  USAFETAC  was  asked  to  develop  a  ceiling/visibility  simula¬ 
tion  technique  capable  of  replacing  the  existing  methods.  This  request  was  the 
genesis  of  a  2-year  technique  development  effort  during  which  the  Ceiling/Visi¬ 
bility  Simulation  Model  and  a  complete  data  base  of  modeling  coefficients  for 
Northern  Europe  was  developed,  tested,  and  delivered.  The  remaining  paragraphs 
of  this  technical  note  detail  the  requirements  imposed  on  the  Ceiling/Visibility 
Simulation  Model,  explain  the  model  itself,  list  its  important  assumptions  and 
limitations,  and  describe  its  performance  in  terms  of  preserving  unconditional 
and  joint  probabilities,  and  correlations. 

2.1.2  The  Use  of  Weather  for  Scoring  or  Decisionmaking.  Application  models, 
such  as  weapon  systems  effectiveness  simulations  and  combat  evaluation  models, 
can  use  weather  information  for  two  different  activities.  The  most  common  use  of 
weather  information  in  military  studies  and  analyses  is  mission  assessment  or 
"scoring."  In  scoring,  the  model  makes  a  decision  based  on  weather  and  other 
factors,  as  to  whether,  for  example,  a  given  target  is  successfully  "killed"  by 
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aerial  attack.  For  scoring  decisions  impacted  by  weather,  the  using  models  needs 
the  values  of  mission-critical  weather  variables  (such  as  ceiling  and  visibility) 
at  the  time  the  mission  is  executed. 

The  second  use  for  weather  information  by  application  models,  is  an  attempt 
to  emulate  certain  aspects  of  the  human  decisionmaking  process  as  applied  in 
combat.  From  a  meteorological  perspective,  human  decisions  can  be  classified  as 
either  (1)  short-range  execution  decisions  based  on  the  observed  present  weather, 
or  (2)  longer  range  planning  decisions  based  on  forecasts  of  future  weather. 
Potential  users  of  the  Ceiling/Visibility  Simulation  Model  should  keep  in  mind 
that  WEASIM  is  ideally  suited  for  the  functions  of  scoring  and  short  term  deci¬ 
sionmaking.  Future  enhancements  of  the  model  may  include  a  capability  to  produce 
forecasts  and  will  be  discussed  in  paragraph  2.3.4  of  this  technical  note. 

2.1.3  Synthetic  Weather  vs.  Direct  Use  of  Historical  Weather.  Analysts  studying 
weather  sensitive  military  operations  or  weapons  systems  have  two  courses  of 
action  in  order  to  include  weather  effects  in  their  studies.  ‘One  way  is  to  use 
historical  weather  data  directly.  The  other  is  to  use  a  simulation  model  such  as 
WEASIM.  There  are  many  problems  with  the  first  approach.  USAFETAC  maintains  over 
120,000  computer  tapes  at  Scott  AFB,  IL  and  at  Operating  Location  A  (USAFETAC, 
OL-A),  Asheville,  NC.  These  tapes  contain  the  historical  weather  record  for 
literally  thousands  of  locations  world  wide.  Obviously  a  potential  user  of 
weather  data  for  studies  cannot  maintain  such  a  tremendous  tape  based  data  set. 
More  commonly  a  user  is  interested  in  the  weather  for  a  few  locations  at  any  one 
time.  in  this  case,  USAFETAC  can  prepare  special  data  tapes  for  the  user  con¬ 
taining  the  weather  data  for  those  selected  locations  only.  This  approach  leads 
to  still  other  problems.  Historical  weather  records  are  almost  always  very  short 
from  the  point  of  view  of  statistical  sampling  theory.  The  period  of  record  (POR) 
does  not  contain  all  the  patterns  of  weather  likely  to  occur  in  the  future  and 
the  worst  weather  on  record  is  not  the  worst  possible  weather.  Any  historical 
POR  is  but  one  realization  of  a  stochastic  time  series.  Future  realizations  will 
resemble  that  historical  record  only  in  a  statistical  sense  even  if  the  underly¬ 
ing  probability  distributions  and  cross  correlations  do  not  change. 

In  using  any  historical  sequence  directly  in  planning  or  design,  one  runs  the 
risk  that  the  sequence  chosen  may  be  too  mild,  too  severe,  too  bland,  too  errat¬ 
ic,  or  in  some  way  unrepresentative  of  the  future.  When  a  single  historical 
weather  record  is  used  alone  for  weapon  system  or  facility  design  purposes,  or 
for  war  planning,  the  designer  or  planner  is  given  only  limited  information  about 
the  risk  involved  in  adopting  or  implementing  a  specific  plan  or  design.  Ques¬ 
tions  such  as  "What  is  likelihood  that  this  design  will  survive  for  25  years?"  or 
"What  is  the  probability  that  this  weapon  system  will  be  95  percent  available  in 
winter?"  go  unanswered  unless  some  means  are  available  for  the  designer  or  plan¬ 
ner  to  exercise  his  design  or  plan  against  a  sufficient  number  of  independent 
25-year,  season-length,  or  war-length  weather  sequences. 
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If  one  uses  a  simulation  model  rather  than  the  historical  weather  record 
directly  some  of  these  problems  can  be  circumvented.  This  model  will  produce 
synthetic  observations  of  ceiling  and  visibility  on  call.  The  user  controls  the 
length  of  the  time  series  generated.  The  synthetic  weather  possesses  a  variabil¬ 
ity  not  unlike  that  of  real  weather  data.  The  model  may  be  run  repeatedly  to 
acquire  risk  statistics  needed  by  the  designer  or  planner. 

It  should  be  emphasized  here  that  this  weather  simulation  model  did  consider 
the  historical  weather  record.  Models  do  not  replace  data.  Historical  weather 
data  was  used  to  condition  regression  coefficients,  probability  distributions, 
and  correlation  parameters  required  as  run  parameters  for  the  model.  In  other 
words,  indirect  use  of  the  historical  weather  record  is  made  by  using  this  simu¬ 
lation  model  in  a  study.  The  distinction  is  between  direct  use  of  the  historical 
weather  in  a  study  and  indirect  use  of  that  record  in  supporting  the  development 
of  this  model  (see  figure  2-1).  NOTE:  This  model  will  only  represent  weather 
occurrences  statistically.  In  the  long  run,  this  model  will  duplicate  the  uncon¬ 
ditional  probability  distributions  as  defined  by  the  input  Weibull  modeling  coef¬ 
ficients  and  reproduce  the  specified  spatial,  temporal,  and  cross-variable 
correlations.  However,  individual  fields  may  lack  soundness  from  a  purely 
physical/meteorological  point  of  view.  If  a  prospective  user  needs  specific 
weather  patterns  to  be  reproduced  then  he/she  must  use  the  historical  weather 
record. 


TO  SUPPORT  WEAPONS  AND  WARFARE  DESI6N  STUDIES 


OIRECT  USE  OF  HISTORICAL 
NEITHER  RECORD 


IMOIRECT  USE  OF 
HISTORICAL  NEITHER 
RECORD 


Figure  2-1.  Direct  Use  vs.  Indirect  Use  of 
Weather  Data  to  Support  Studies. 

2.1.4  Ceiling/Visibility  Simulation  Model  Requirements.  The  requirements  stipu¬ 
lated  for  WEASIM  were  originally  stated  in  the  HQ  USAF/SA-USAFETAC  Statement  of 
Work  and  are  summarized  below. 

"USAFETAC  shall  develop  a  mathematical  simulator  of  variations  in  ceiling  and 
visibility  ifi  the  European  theatre.  This  weather  simulator  shall  provide,  on 


demand  of  a  host  program,  realistic  time  series  of  ceiling  and  visibility  condi¬ 
tions  for  multiple  locations.  These  synthetic  observations  must  be  consistent 
with  each  other  as  functions  of  space  and  time.  The  mathematical  functions  used 
to  generate  observations  of  ceiling  and  visiblity  will  be  continuous  in  two  spa¬ 
tial  (horizontal)  dimensions  and  one  temporal  dimension  so  as  to  provide  informa¬ 
tion  at  resolutions  needed  by  the  parent  military  operations  simulator.  This  is 
understood  to  mean  information  at  discrete  locations  or  grid  points  consistent 
with  the  original  weather  record  data  base." 

The  basic  methodology  and  mathematics  of  the  model  are  covered  in  the  next 
paragraphs  of  this  technical  note  and  a  description  on  how  well  WEASIM  meets  the 
stated  requirements  is  presented  in  chapter  4. 

2.2  Detailed  Methodology 

2.2.1  The  Basic  Model.  The  basic  component  of  the  Ceiling/Visibility  Simulation 
model  is  based  on  an  Ornstein-Uhlenbeck  stochastic  process.  This  single-variable 
single  location  submodel  is  an  autoregressive,  first-order  Markov  process  in 
which  each  value  of  a  random  variable  Xt  is  taken  to  be  a  particular  value  of  a 
stationary  stochastic  process.  It  is  a  common  and  usually  justifiable  assumption 
to  treat  weather  variables  as  a  first-order  Markov  process  (see  Whiton  and 
Berecek,  1982).  The  Ornstein-Uhlenbeck  process  is  well  based  in  the  statistical 
literature.  It  can  be  applied  with  substantial  justification  to  variables  whose 
time  series  have  a  random  component  and  approximately  adhere  to  the  first-order 
Markov  restriction. 


The  generation  of  a  time  series  of  a  single  meteorological  variable  would  be 
quite  simple  if  each  value  in  the  time  sequence  were  independent  of  all  others. 
In  general,  this  is  not  the  case.  Whether  successive  meteorological  observations 
are  independent,  depends  on  the  time  separation  between  them.  The  common  separa¬ 
tion  between  surface  meteorological  observations  is  1,  3  or  6  hours.  At  these 
separations,  successive  observations  of  most  meteorological  variables  are  not 
serially  independent.  The  goal  of  this  simulation  model  is  to  reproduce  this 
serial  dependence  between  successive  values  of  ceiling  and  visibility,  as  well  as 
to  reproduce  their  unconditional  probability  distributions. 

Assume  that  the  variable  to  be  simulated  (X)  is  normally  distributed.  If  a 
variate  is  not  normally  distributed,  it  can  be  transformed  to  the  normal  distri¬ 
bution  by  expressing  the  values  of  the  raw  variable  in  terms  of  its  equivalent 
normal  deviate  (END).  Transformation  of  variables  to  the  normal  distribution  is 
covered  in  detail  in  Boehm  (1976)  and  is  summarized  in  paragraph  2.2.2.  For  the 
purposes  of  this  example,  random  samples  of  X  at  times  t  and  t+1  shall  be  con¬ 
sidered  two  separate  and  distinct  variables.  The  joint  normal  density  function 
of  the  two  variables  Xfc  and  Xt+1,  with  mean  m»  variance  o2,  and  serial  correla¬ 
tion  p  between  successive  values  is, 


fxtxt+l<xt'xt+1) 


1  (  Xt-M  >  2  -2p  ( Xt-p  >  (  Xt+1-M  )  ■*•  (  Xfc  j-p  )  2 

-  exp[— E - i - Hli - £±i - )  (2-1) 

2no2 (1-p2  )l /2  2o2 ( 1-p*  ) 

So  the  joint  normal  probability  of  two  random  variables  with  the  same  mean  and 
variance  depends  only  on  p,  a2,  and  their  correlation  p.  The  generation  of  a 
time  series  of  observations  then  requires  the  conditional  distribution  of  the 
weather  variable  at  one  time  given  the  value  of  the  variable  in  previous  hours. 
If  the  weather  process  approximates  a  first-ordei  Markov  process,  the  dependence 
of  the  distribution  at  time  t+1  on  the  distribution  of  the  variable  in  previous 
hours  is  summarized  by  the  value  of  the  variable  at  time  t.  If  successive  ob¬ 
servations  of  this  arbitary  variable  have  a  multivariate  normal  distribution,  the 
conditional  distribution  of  Xt+1  is  normal  with  mean  and  variance  equal  to, 

ElXt+l,Xt=XtJ  =  M  +  P(*t-M)  (2-2) 

Var[Xt+1!Xt=xt)  =  o2(l-p2)  (2-3) 

where  xt  is  the  value  of  Xt  at  hour  t  (see  figure  2-2).  From  equation  (2-3)  it 
can  be  seen  that  the  larger  the  absolute  value  of  the  serial  correlation  p  be¬ 
tween  the  values  of  the  variable,  the  smaller  the  conditional  variance  of  Xt+1, 
which  does  not  depend  at  all  on  the  value  of  xt. 

A  time  series  of  synthetic,  normally  distributed  variables  with  mean  p, 
variance  a2,  and  a  serial  correlation  p  is  produced  by  the  equation, 

Xt+1  =  M  *  p(Xt  “  p)  +  °  (2-4) 

where  nt  is  a  standard  normal  random  number,  i.e.,  a  number  drawn  at  random  from 
a  population  with  a  mean  of  zero  and  a  variance  of  unity,  abbreviated  as  N(0,1). 
Each  nt  is  totally  independent  of  past  values  of  n  as  well  as  past  values  of  X. 
If  the  variable  being  simulated  is  expressed  as  an  END  (which  itself  is  distri¬ 
buted  N( 0 , 1 ) ) ,  then  equation  (2-4)  simplifies  to, 

Xt+1  =  PXt  +  (2-5) 

(a)  (b) 

which  is  an  Ornstein-Uhlenbeck  stochastic  process  in  two  parts,  a  deterministic 
part  (a)  and  a  random  or  stochastic  part  (b)  expressing  the  uncertainty  in  the 
random  process.  X^^^  will  have  a  normal  distribution  if  both  Xt  and  ot  ate  norm¬ 
ally  distributed  because  the  central  limit  theorem  states  the  sums  of  independ¬ 
ent,  normally  distributed  random  variables  are  normally  distributed.  In  the  case 
of  independence  between  successive  X  values,  where  p  3  0,  the  deterministic  part 
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Figure  2-2.  Conditional  Distribution  of  Xfc+1  Given  Xt  =  xt- 


(a)  is  weighted  by  V5*  =  0,  and  the  stochastic  part  (b)  is  weighted  by  VJ*  *  1#  S( 
successive  values  of  X  are  fully  random.  When  complete  positive  dependence 
between  successive  values  of  X,  where  p  =  1,  the  deterministic  part  is  fully  in 
control,  and  each  succeeding  Xfc+1  is  identical  to  its  predecessor  Xfc. 

Correlation  in  the  intermediate  case,  where  0  <  p  <  1,  can  be  seen  by 
recalling  the  definition  of  the  Pearson  product  moment  correlation  coefficient: 


2 


vZ(u-u)J  V2(v-v)2 


which  can  be  rewritten  as. 


1  2[(u-u)(v-v) 

N  Bu  6v 


(2-7) 


2uv  -  Nuv 


(2-8) 


where  the  bars  represent  means  or  expected  values,  and  s  represents  the  standard 
deviation,  the  square  root  of  the  variance. 

In  applying  equation  (2-8)  to  equation  (2-5)  for  u  =  Xt+1  and  v  =■  Xfc,  one 
finds  that  for  standard  normally  distributed  X, 


*t  =° 


Xt+1  =  0 


8X  =  1 

*t 


BX  =1 
*t+l  1 


r  =  s  1  xt+i  xt 


(2-9) 


If  the  definition  of  Xt+1  from  equation  (2-5)  is  used,  then  by  substitution, 
equation  (2-9)  becomes. 


_ . 

r  =  p  -  +  V  1-P‘ 

N 


*  <nt  xt) 


r  =  pE(Xt*J  +  E[ntxt] 


(2-10) 


where  E  represents  the  expected  value  or  mean.  Since  Xt  is  perfectly  correlated 
with  itself, 


E[Xt2)  =  1 
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Furthermore,  since  nt  end  Xt  are  independent  of  each  other 


and. 


B[ntXt]  »  0 


r  *  p 


(2-11) 


for  the  Ornstein-Uhlenbeck  model. 

The  question  remains,  does  the  time  series  model  defined  by  equation  (2-4) 
reproduce  a  population  with  a  specified  mean  and  variance?  The  conditional  mean 
of  Xt+1,  given  that  Xt  equals  xt  is, 

■IWV'tl  =  EIm  +  p(*t“p)  +  nt]  (2-12) 

Since  E(ot)  =  0,  equation  (2-12)  reduces  to, 

E[Xt+1lxt]  =  M  ♦  P(xt-p) 

which  is  the  conditional  mean  specified  by  equation  (2-2).  The  conditional  vari¬ 
ance  of  Xt+1  produced  by  equation  (2-4)  is  (NOTE:  The  variance  of  X  is  given  by 
Var(X)  =  <ox)2  =  (l/N)l(X-px)2  *  E[(X  -  E[X)>2]), 

Var[Xt+1lxt]  =  E[{Xt+1  -  E[Xt+1 |xt)} « |xtJ 

*  e[{m  +  p(xt  -  m)  +  WT=p7nt  -  Iv  *  p(*t 

=  ElaVl-p*nt] 2  (2-13) 

Since  expected  variance  of  nt  is  equal  to  1  (E[Var(nt)l  =1), 

Var[Xt+1|xt]  *  o2(l-p2) 

which  is  equation  (2-3).  Thus  the  model  produces  distributions  with  the  correct 
conditional  mean  and  variance.  The  unconditional  mean  of  equals 

E[Xt+i]  =  M  +  P  (E[Xt]  -p)  +  E[ntJo  Vl-Pa  (2-14) 

Once  again  noting  that  the  mean  of  f)t  =  0  and  that  the  distribution  of  the  vari¬ 
able  is  independent  of  time  so  that  for  all  t,  E[Xt+1)  =  E[Xt]  =  E[X],  it  is 
clear  that, 


The  unconditional  variance  of  the  variable  X  is,  using  equation  (2-4), 

El(Xt+1  -  M)2]  ■  E[{p(Xt-M)  ♦  oVl=P?nt}2l  (2-17) 

=  p2E((Xt-M)2]  +  2poVlrp?  E((Xt-p)nt] 

+  o2(l-p2)  E(ri|] 

Since  each  value  of  nt  is  independent,  then  E[ixt-p)nt]  =  0  and  E(nt2]  *  1. 
Therefore,  using  the  fact  that  E[(Xt+1-p)2]  =  E[(Xt-p)2  =  E[(X-p)2],  the  uncon¬ 
ditional  variance  of  all  X  satisfies  the  equation, 

(1-P2)  E( (X-p )2 J  =  (l-p2)o2  (2-18) 

The  conditional  mean  of  does  not  depend  on  the  assumption  that  the  ran¬ 

dom  variables  Xt  and  nt  are  normally  distributed.  This  relationship  applies  to 
all  autoregressive  Markov  processes  in  the  form  of  equation  (2-4),  regardless  of 
the  distributions  of  Xfc  and  nt-  However,  if  the  variable  Xt  at  time  t  is  normal¬ 
ly  distributed  with  mean  p  and  variance  a2,  and  if  the  nt  values  are  independent¬ 
ly  normally  distributed  with  a  mean  of  0  and  variance  of  1,  then  the  generated 
X's  for  t  >_  1  will  also  be  normally  distributed  with  mean  p  and  variance  o2. 

2.2.2  Transformation  to  the  Normal  Distribution.  To  apply  equation  (2-5)  to  a 
weather  variable  that  is  not  normally  distributed,  one  must  first  transform  the 
nonnormal  variable  to  its  END  8.  The  transformation  to  the  normal  distribution 
is  referred  to  as  transnormalization  and  is  pictured  graphically  in  figure  2-3. 
This  figure  portrays  the  empirically  determined  cumulative  frequency  distribution 
of  the  ceiling  at  Loring  AFB,  ME,  for  January  at  1200  Local  Standard  Time  ( LST ) , 
obtained  from  an  historical  weather  tabulation  called  the  Revised  Uniform  Summary 
of  Surface  Weather  Observations  (RUSSWO).  Figure  2-3  actually  shows  an  empirical 
estimate  of  the  cumulative  probability  Pr(C  <  cT)  of  the  cloud  ceiling  at  Loring 
AFB,  ME,  1200  LST,  January,  where  C  represents  the  ceiling  in  feet  and  cT  is  some 
threshold  value  of  the  ceiling  in  feet.  In  the  example  shown,  the  probability 
that  C  is  less  than  cT  =  5,000  feet  is  0.351. 

Pr(C  <  cT)  =  0.351 

In  the  context  of  the  normal  probability  distribution,  this  probability  corres¬ 
ponds  to  some  END  tt.  In  other  words,  the  integral  of  the  standard  normal  density 
function  4(u)  from  u  =  -»  to  u  =  H  is  Pr(C  <  cT),  where 
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Figure  2-3.  Cumulative  Distribution  Function  (CDF) 
of  the  Ceiling  at  Loring  AFB,  ME,  for  January  at 
1200  LST.  CDF  extracted  from  the  Loring  AFB  RUSS WO. 


0(u)  *  exp(-uV2)  (2-19) 

V  2n 


*(cT)  =  Pr{C  <  cT}  =  /  *(u)  du 


(2-20) 


The  probability  Pr(C  <  cT)  is  actually  the  area  under  the  standard  normal  curve 
from  -»  to  C,  as  shown  in  figure  2-4.  Rational  approximations  to  the  integral  of 
the  normal  probability  distribution  are  available  for  working  on  a  computer  and 
show  that  a  probability  of  0.351  corresponds  to  an  END,  H  =  -0.381. 

Thus  for  1200  LST  in  January  at  Loring  AFB,  using  RUSSWO  data,  a  ceiling  of 
5,000  feet  corresponds  to  an  END  of  -0.381.  The  RUSSWO  is  only  an  approximation 
to  reality;  therefore,  it  is  better  to  say  that  5,000  feet  corresponds  approx¬ 
imately  to  an  END  of  -0.381.  Approximate  transformations  are  given  in  table  2-1. 

Using  the  normal  transformation,  then,  every  ceiling  corresponds  to  an  END  of 
that  ceiling.  Since  ENDs  are  in  themselves  normally  distributed  with  a  mean  of 
zero  and  variance  of  one,  they  can  be  used  as  random  variables  in  the  Ornstein- 
Uhlenbeck  process,  equation  (2-5).  Such  a  process,  applied  to  the  ceiling  (not 
normally  distributed  in  general)  is 
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Figure  2-4.  Normal  Probability  Distribution, 
Integrated  from  -»  to  6  =  -0.381,  Yields  a 
Cumulative  Probability  Pr  =  0.351. 


Table  2-1.  Transnormalization 
ME,  January,  1200  LST. 

from  Ceiling  to  END 

Cumulative 

Probability 

for  Loring 

Ceiling  (ft) 

C  <  cT 

END 

0 

0.000 

•00 

200 

0.009 

-2.295 

1,000 

0.113 

-1.209 

2,000 

0.203 

-0.833 

3,000 

0.269 

-0.612 

5,000 

0.351 

-0.381 

10,000 

0.418 

-0.205 

20,000 

0.489 

-0.032 

ii  n 

ct+l  =  pcc  ct 

+  ^^7  'It 

where  c  values  are  ENDS  of  the  ceiling  C. 


2.2.3  Simulation  of  the  Cloud  Ceiling.  To  see  how  such  a  simulation  works  in 
practice,  consider  a  case  with  an  initial  ceiling  at  2,000  feet  (the  correspond¬ 
ing  END  d  is  -0.833).  Assume  a  serial  correlation  according  to  the  exponential 
decay  model, 


where  At  is  the  time  step,  unity  in  this  case.  We  generate  a  random  normal 
number,  for  example  nt  =  0.325.  Applying  the  Ornstein-Uhlenbeck  process  in 
equation  (2-21)  yields 

ct+i  =  (0.945) (-0.833)  ♦  (/T --"0.945«! )  (0.325) 

=  (0.945)(-0.833)  +  (0.327)(0.325) 


-0.787 


-0.681 


0.106 


which  corresponds  to  a  ceiling  of  about  2,300  feet.  At  the  next  time  step,  (*t 
becomes  -0.681.  Another  random  normal  number  is  drawn,  say  -0.102.  Then 

ct+1  =  (0.945 ) (-0 . 681 )  +  (0.327 ) (-0. 102 ) 


-0 . 644 
-0.677 


0.033 


which  again  corresponds  to  a  ceiling  of  about  2,300  feet. 

If  continued,  this  process  will  generate  a  time  series  of  the  ceiling  whose 
probability  distribution  is  the  same  as  the  distribution  specified  initially 
(e.g.,  figure  2-3),  within  the  limits  imposed  by  sampling  error.  The  process 
will  not  necessarily  produce  the  same  durations  as  those  of  the  original  data. 
The  distribution  of  durations  of,  for  example,  low  ceiling  episodes  is  affected 
by  the  parameter  pcc  and  by  the  first  order  Markov  assumption.  It  is  possible  to 
determine  a  value  of  pcc  that  will  best  "fit"  a  given  distribution  of  durations. 

2.2.4  Expanding  the  Basic  Model  to  Two  Variables.  The  simulation  submodel 
expressed  in  equation  (2-5)  is  severely  limited.  It  can  be  applied  only  to  a 
time  series  of  a  single  variable,  such  as  ceiling.  The  purpose  of  WEASIM  is  to 
simulate  more  than  one  variable  (ceiling  and  visibility)  while  preserving  the 
cross-correlation  between  them.  WEASIM  handles  the  two  variable  case  by  includ¬ 
ing  two  time  series  of  ENDs.  One  END  for  each  of  the  two  variables,  and  then 
carrying  the  cross-correlation  information  in  the  stochastic  part  of  the  solu¬ 
tion.  Specifically  there  is  an  END  for  the  ceiling,  ft,  and  an  END  for  the  visi¬ 
bility,  V.  These  advance  by  separate  Ornstein-Uhlenbeck  equations: 


pcc  ct  +  VI=p^» 

=  Pw  vt  +  n, 


(2-21) 


(2-23) 


But  because  it  is  desired  to  produce  time  series  of  ceiling  and  visibility  that 
are  correlated  across  variables  (i.e.,  cross-correlated),  the  stochastic  parts  of 


equations  (2-21)  and  (2-23)  must  be  linked.  This  is  done  by  generating  a  random 
normal  number  of  visibility  nv  that  is  correlated  with  that,  nc*  previously  gen¬ 
erated  for  ceiling.  To  do  this,  the  procedure  is  first  to  generate  an  independ¬ 
ent  nc  and  then  to  set 

nv  =  (‘cv  nc  +  n  <2-24) 

where  o  is  another  independent  random  normal  number,  and  p^v  is  proportional  to 

the  cross-correlation  between  ENDs  of  ceiling  and  visibility.  Equation  (2-24)  is 

essentially  the  generation  algorithm  for  producing  ENDs  having  the  correlation 

p  *  . 

Kcv 

In  the  case  of  independence  when  p^v  =  0,  nv  =  n  and  equations  (2-21)  and 
(2-23)  generate  unrelated  time  series  of  ceiling  and  visibility.  In  the  case  of 
perfect  positive  correlation,  when  p^  =  1, 

%  =  ^c 

the  time  series  for  visibility  will  depend  completely  on  that  for  the  ceiling. 
If  pcc  and  p^  =  1,  the  two  time  series  will  be  identical  except  for  a  shift  due 
to  differing  initial  values.  In  the  intermediate  case,  ceiling  and  visibility 
will  be  partially  correlated  according  to  the  value  of  p^v,  which  is  proportional 
to  the  correlation  pcv  between  ceiling  4nd  visibility.  The  process  is  depicted 
in  figure  2-5. 


Figure  2-5.  The  Simulation  Process  for  the  Two 
Variable  Case. 


2-13 


The  serial  correlation  pcc  between  ceiling  at  t  and  ceiling  at  t+1  is  pre¬ 
served,  as  is  the  correlation  pw  between  visibility  at  t  and  visibility  at  t+1. 
The  correlation  pcv  between  ceiling  and  visibility  at  the  same  time  is  propor¬ 
tional  to  p'  through  a  constant  of  proportionality  f. 


It  is  instructive  to  consider  how  the  cross-correlation  pcv  between  ceiling 
and  visibility  relates  to  p^.y.  Using  equations  (2-21),  (2-23),  and  (2-24)  in 
equation  (2-9)  produces  the  equation. 


PCV  N  z  ^pcc  ct  +  ^-pccz  1c1 

(Pw  +  (piv  nc  +  ^"pcvz 

which  expands  to, 

II  II 

pcv  -  z  pcc  pcv  ct  vt 

+  N  Z  pcc  pcv  ^-pwZ  r|cct 
+  b  1  pcc  J1-pw^  VT~~pc7  ^t 
+  N  Z  pw  vl‘pc7  pcvt 

+  s  z  piv  ^7  ^7  nc2 
+  k  i  ^7  ncn 


(2-25) 


(2-26) 


II  II  II 

Because  E(ncct]  =  E[nct)  =  E[ncvt]  =  E(ncn)  =  o,  due  to  independence,  and 
because  E(nc2J  =  1  by  definition  of  a  N(0,1)  variable. 


cv 


cc 
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(2-27) 


But, 


Z  Ct  Vt  = 


cv 


so. 


cv 


pcc  pw  pcv 


+  p '  Jl-0  2 
wcv  v  ^cc 
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(2-28) 


or 
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(2-29) 
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cv 


J 1-p  2  JT-p  2 

N  Kcc  *  Kw 

1-p  p 

Kcc  Kw 


p 


cv 


From  equation  (2-29)  it  can  be  seen  that  to  obtain  a  correlation  pcv  between 
ceiling  and  visibility,  it  is  necessary  to  use  a  model  correlation  parameter  p* 

CV 

given  by 


l-o  p 

,  _  _ Hcc  w 

Pcv  VT^p-?  ,TTp “a 


p  cv  ^p  cv 


(2-30) 


cc  w 

The  factor  f  reduces  to  1  when  Pcc  =  P^  but  otherwise  is  greater  than  1,  so 


p  ’  >  p 

Krv  —  K 


CV  —  'cv 


This  is  illustrated  in  figure  2-6,  in  which  it  is  desired  to  obtain  pcy  =  0.3. 
The  p^.v  necessary  to  obtain  that  pcv  value  is  0.3  if  pcc  =  pcv.  Where  pcc  f  pw, 
the  p'  needed  to  obtain  p  =  0.3  is  greater  than  0.3.  For  example  if  p  =  0.8 

wV  CC 

and  pw  =  0.4  then  =  0.8,  p2  =  0.4,  p^v  =  0.37.  The  ratio  f  =  P^v/Pcv  can  be 

quite  large  for  cases  where  pcc  and  pw  differ  substantially.  Figure  2-7  gives 

the  factor  f  as  a  function  of  p„„  and  p„„. 

cc  cv 


e, 

Figure  2-6.  Values  of  p^,v  for  pcv  =  0.3. 
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Figure  2-7.  Values  of  f. 


It  can  be  seen  from  equation  (2-24)  that  real  solutions  can  be  obtained  only  if 


pcv  *s  less  or  e<Jual  to  unity.  Hence, 


f  p  <1 
Kcv  - 


p  <  i 

Kcv  -  f 


(2-31) 


The  mathematics  imposes  an  upper  limit  on  the  cross-correlation  this  model  is 
capable  of  producing  between  ceiling  and  visibility.  The  above  example,  in  which 
Pcc  =  0.8  and  pcy  =  0.4,  f  =  1.24  and  pcv  cannot  exceed  0.81.  In  this  case,  the 
model  in  its  present  form  cannot  simulate  phenomena  "c"  and  "v"  whose  ENDs  are 
cross-correlated  more  strongly  than  0.81.  This  upper  limit  on  p^  depends  on 
p cc  and  Pw  and  must  be  treated  on  a  case-by-case  basis.  In  the  special  case 
where  pcv  =  p^v,  cross-correlation  values  up  to  1.0  can  be  simulated. 

The  two  variable,  single  station  version  of  the  model  does  not  explicitly 
preserve  what  is  known  as  the  cross-lag  correlation,  such  as  Pvtct+i  >  th®  corre¬ 
lation  between  the  visibility  at  time  t  and  the  ceiling  at  time  t+1.  This  can  be 
seen  by  applying  equation  (2-8)  to  equations  (2-21)  and  (2-23). 


(2-32) 
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Because  vt  and  nc  are  independent,  the  final  term  is  zero,  and 


pct+1vt  ~  pcc  pcv 


(2-33) 


In  the  two  variable,  single  station  version  of  the  model,  the  cross-lag  correla¬ 
tion  reduces  to  the  product  of  autocorrelation  of  the  ceiling  jnd  the  cross¬ 
correlation  of  the  ceiling  and  visibility.  This  is  equivalent  to  saying  that  in 
this  model  the  cross-lag  correlation  reduces  to  the  automatic  correlation  between 
ceiling  and  visibility. 


Panofsky  and  Brier  (1968)  describe  automatic  correlation  in  terms  of  two 
variables  p  and  q  that  are  each  separately  correlated  with  a  third  variable  s. 
The  cross-correlations  are 


and  p_„ 
ps  sq 


The  separate  correlations  between  s  and  p  and  between  s  and  q  guarantee  an 
"automatic”  correlation  between  p  and  q  even  if  the  two  are  not  intrinsically 
related.  The  automatic  correlation  would  be  the  product  ppg  ps<j. 

Application  to  the  two  variable  O-U  model  is  shown  in  figure  2-8.  is  cor¬ 
related  with  ®t+1  by  autocorrelation  pC(,.  Ct  is  correlated  with  by  cross¬ 
correlation  pcv.  This  guarantees  an  automatic  correlation  of  (Pcc)  *  (pcv) 
between  and  The  model  cross-lag  correlation  given  by  equation  (2-33)  is 
the  automatic  correlation.  In  the  two  variable  O-U  model  the  cross-lag  correla¬ 
tion  reduces  to  the  automatic  correlation. 


Whether  this  is  true  in  nature  is  another  question.  A  model  is  a  simplifica¬ 
tion  or  generalization  of  nature.  Work  conducted  to  date  gives  no  indication 
that  reducing  the  cross-lag  correlation  to  the  automatic  correlation  has  any 
adverse  affect  on  the  model  as  a  weather  simulator.  Major  A1  Boehm,  USAFETAC/DNM, 
who  first  applied  the  O-U  model  to  simulate  meteorological  variables  believes 
that  cross-lag  correlations  between  ceiling  and  visibility  are  very  nearly  equal 
to  automatic  correlation.  Whether  this  is  true  or  at  least  approximately  true 
for  other  variables  is  subject  to  verification  using  actual  data. 


Figure  2-8.  Correlation  Influence  Diagram  for  the 
Two  Variable  0-U  Model.  The  cross-lag  correlation 
is  shown  as  a  dotted  line  because  it  reduces  to 
automatic  correlation  in  this  model. 

2.2.5  Modeling  Cumulative  Distribution  Functions 

2. 2. 5.1  Tabular  Approach.  The  procedure  described  previously  in  paragraph  2.2.2 
involved  using  RUSSWOs  or  tabulated  data  to  construct  tables,  such  as  table  2-1, 
relating  ceiling  heights  to  their  ENDs.  This  can  be  accomplished  using  any  cumu¬ 
lative  distribution. 

There  are  several  important  problems  with  this  approach.  First,  any  table  is 
discrete.  The  model  generates  continuous  END  values  such  as  C  =  -0.441234. 
Table  2-1  contains  no  such  value.  Interpolation,  which  introduces  error,  would 
be  required  to  translate  such  an  END  into  its  corresponding  ceiling  height. 
Secondly,  surface  weather  observations  contain  errors  and  biases  introduced  by 
the  weather  observing  system.  These  often  show  up  as  bumps  or  spikes  in  the 
relative  frequencies,  corresponding  to  reportable  values,  popular  valu<  1oca- 
tion  of  visibility  markers,  etc.  Finally,  from  the  point  of  view  of  simulation, 
it  is  inefficient  to  maintain  in  computer  storage  entire  RUSSWOs  from  which  to 
interpolate  probabilities. 

2. 2. 5. 2  Distribution  Fitting  Approach.  A  flexible  and  very  powerful  alternative 
to  storing  RUSSWOs  is  to  model  the  RUSSWO  probabilities,  using  regression  tech¬ 
niques  to  fit  variously  shaped  probability  distributions  or  curves  to  RUSSWO 
data.  The  result  of  this  is  a  continuous  function  of  the  form, 


from  which  continuous  probability  estimates  can  be  obtained  simply  by  evaluating 
the  function.  More  importantly,  continuous  variable  estimates  can  be  obtained  by 
evaluating  the  function  inverse: 

x  =  F~ 1 (P )  (2-35) 

Considerable  work  of  this  sort  has  been  done  by  Somerville  and  Bean  (1979)  of 
the  University  of  Central  Florida  and  by  USAFETAC. 

Fitting  of  the  observed  distributions  of  ceiling  and  visibility  to  theoretical 
distributions  is  discussed  in  greater  detail  in  paragraph  3.2  of  this  technical 
note.  This  model  uses  the  Weibull  curve  for  modeling  visibility  and  a  variant  of 
the  Weibull,  called  the  Reverse  Weibull  for  ceiling. 

2.2.6  Expanding  the  Two  Variable,  Single  Station  Case  to  Multiple  Stations  with 


the  Sawtooth 

Wave  Submodel . 

The  0-U 

equations 
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expressed  in 

equations  (2- 
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In  the  same  manner  as  in  the  single  station  two  variable  model,  desired  correla¬ 
tion  can  be  incorporated  into  the  stochastic  portion  of  the  equations.  If  the 
random  normal  number  fields  gc  and  Qy  have  a  spatial  structure  or  spatial  cor¬ 
relation  function  then  the  resultant  END  fields  for  ceiling  and  visibility  will 
display  the  same  type  of  spatial  correlation  structure. 

while  this  can  be  done  by  a  number  of  techniques,  one  that  is  well  suited  to  a 
simulation  with  a  large  number  of  locations  is  a  two-dimensional  field  simulation 
model  based  on  a  sawtooth  wave  submodel  developed  by  Maj  Albert  Boehm,  USAFETAC/ 
DNM,  for  USAFETAC  Project  1960,  Colossus  Weather  Simulation.  The  sawtooth  wave 
model  is  used  to  generate  the  two-dimensional,  spatially  correlated  fields  of 
random  normal  numbers.  These  random  normal  numbers  distributed  N(0,1)  are  then 
used  in  the  O-U  equations  to  generate  ENDs  of  ceiling  and  visibility  which  are 
then  transformed  by  the  transnormalizing  functions  to  raw  values  of  those  vari¬ 
ables.  The  sawtooth  wave  submodel  has  been  shown  to  be  quite  flexible  in  opera¬ 
tional  use  and  can  generate  fields  of  random  normal  numbers  whose  spatial 
correlation  decay  function  is  easily  adjustable. 

2.2.7  Spatial  Correlation  Decay  Function  of  the  Random  Normal  Number  Field 
Produced  by  the  Sawtooth  Wave  Submodel.  The  sawtooth  wave  submodel  generates  a 
field  of  ENDs  q  having  a  desired  spatial  correlation  function  r.  Consider  the 


correlation  r  between  values  n j  and  n j+Aj  located  one  grid  distance  Aj  apart 
(figure  2-9).  Repeated  samplings  of  the  value  of  n  at  j  and  at  j+Aj  would  pro¬ 
duce  a  history  of  N  data  pairs  from  which  the  spatial  correlation  could  be  esti¬ 
mated  by  the  Pearson  product  moment  formula, 
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(2-39) 
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Figure  2-9.  Correlation  Between  n  at  Location 
J  =  j  and  n  at  Location  J  =  j  +  Aj  Located  One 
Grid  Distance  Apart. 


or  by  some  other  method.  In  equation  (2-39),  the  overbars  represent  means,  and  s 
represents  the  standard  deviation.  Since  the  n  are  ENDs,  they  are  distributed 
normally  with  a  mean  of  zero  and  a  variance  of  unity.  Therefore,  for  normally 
distributed  n,  equation  (2-39)  reduces  to 


r  = 


N 


N  , t?1  nj,k  nj+Aj,k 


(2-40) 


Spatial  correlation  is  being  dealt  here.  The  correlation  between  o  values 
will  be  perfect  (unity)  at  zero  separation  (Aj  =  0)  and  will  be  less  than  or 
equal  to  unity  with  increasing  distance  Aj.  To  model  the  weather,  a  correlation 


function  is  needed  that  starts  at  unity  and  decreases  with  increasing  distance 
Aj. 

One  such  model  that  has  been  used  successfully  in  ceiling  and  visibility 
modeling  is  that  of  Gringorten  (1979).  In  Gringorten's  Model-B,  the  correlation 
function  r  depends  on  the  geometric  distance  d  and  a  characteristic  scale  dis¬ 
tance  D,  defined  as  the  distance  at  which  the  correlation  r  falls  to  0.99. 
Gringorten's  Model-B  is  defined  by, 

r  =  r(d,D)  =  -  (cos  xo  =  ]  (dimensionless)  (2-41) 

where , 

o  =  d/(128  D)  (dimension!' .... )  (2-42) 

Because  a  =  >  0,  the  trigonometric  relationship  between  arc  cosine  and  arc  sine 
can  be  used,  yielding  the  form, 

r  =  r(d,D)  =  |  (sin‘‘H  -  oH)  (2-43) 

where 


= 


(2-44) 


In  this  correlation  function  model,  when  the  distance  d  equals  the  scale 
distance  D,  a  =  1/128,  H  =  0.99997  ~  1,  sin'll)  =  n/2,  oH  =  0.00781,  and  r  = 
0.99.  Note  that  when  a  =  1,  H  =  0,  and  r  =  0.  Therefore,  Gringorten's  Model-B 
correlation  drops  to  zero  at  a  distance  d  =  128  D.  The  scale  distance  for  visi¬ 
bility  in  Germany  during  April  is  estimated  to  be  4  kilometers.  Using  this  for  D 
in  equations  (2-43)  and  (2-44)  gives  the  correlation  function  shown  in  figure 
2-10.  With  this  scale  distance,  the  correlation  drops  to  0.99  in  4  kilometers 
(2  NM)  and  to  approximately  zero  at  512  kilometers  (276  NM). 


It  is  desired  that  the  sawtooth  wave  model  produce  a  field  of  ENDs  having  the 
spatial  correlation  function  similar  to  Gringorten's  Model-B.  During  his  earlier 
work  with  the  sawtooth  wave  generator,  Boehm  developed  empirical  equations  that 
converted  a  desired  Gringorten  Model-B  scale  distance  (SD)  into  maximum  and  min¬ 
imum  allowable  wavelengths  (the  interval  from  which  the  wavelengths  of  each  saw¬ 
tooth  wave  will  be  selected  at  random)  for  the  sawtooth  wave  generator.  These 
equations  are: 


w  (upper)  =  C(u)  *  SD 


(2-45) 


W  (lower)  =  C(l)  *  SD 


(2-46) 
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Figure  2-10.  Correlation  Function  for 
Gringorten's  Model-B  with  Scale  Distance 
D  =  4  Km. 


The  dimensionless  constants  C(u)  and  C(l)  are  selected  so  the  model  returns  a 
spatial  correlation  function  for  the  random  normal  number  field  that  has  a  shape 
similar  to  a  Gringorten  curve  for  the  particular  SD.  The  wavelengths  will  be  in 
the  same  units  as  the  scale  distance.  The  values  of  175  and  450  were  used  for 
C(l)  and  C(u),  respectively,  in  an  earlier  application  of  the  sawtooth  wave 
submodel  (see  Whiton,  et  al,  1981).  This  previous  application  dealt  with  the 
simulation  of  forecasts  of  total  cloud  cover.  The  characteristic  scale  distance 
for  this  variable  is  usually  greater  than  5  Kilometers.  The  variables  being 
considered  in  WEASIM  (ceiling  and  visibility)  have  much  smaller  characteristic 
scale  distances  (usually  less  than  4  Km).  For  this  application  205  and  560  were 


Vi 
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used  for  C(l)  and  C(u),  respectively.  These  new  constants  provided  much  better 
estimates  of  the  Gringorten  spatial  correlation  decay  function  for  the  shorter 
scale  distances  associated  with  ceiling  and  visibility. 

2.2.8  The  Mathematics  of  the  Sawtooth  Wave.  In  the  sawtooth  wave  model,  N  saw¬ 
tooth  waves  are  allowed  to  emanate  circularly  from  N  focal  points.  Each  focal 
point  is  the  source  of  exactly  one  wave.  The  location  of  each  focal  point  is 
picked  at  random,  and  the  wavelength  of  each  wave  is  selected  at  random  from  a 
range  of  permissible  wavelengths.  The  field  of  ENDS  (q)  is  simply  the  sum  of  N 
sawtooth  wave  amplitudes  at  each  grid  point,  corrected  by  subtraction  of  a 
constant. 

Each  sawtooth  wave  is  as  shown  in  figure  2-11.  Amplitude  of  the  wave,  shown 
by  y,  varies  between  zero  and  one,  depending  on  the  observer's  position  along  the 
wave.  The  sawtooth  wave  used  here  is  a  standing  wave.  Originating  with  zero 
amplitude  at  a  focal  point  at  distance  d'  -  0,  it  reaches  maximum  amplitude 


d’ 

Figure  2-11.  Sawtooth  Wave,  d’  represents  normalized 
distance  orthogonal  to  the  wave  front.  The  normalized 
distance  d'  is  measured  in  unit  wavelengths,  where 
wavelength  is  represented  by  w. 


(unity)  at  distance  d'  =  1  wavelength,  and  thereafter  falls  to  zero  amplitude 
again.  Within  any  one  cycle  of  the  sawtooth  wave,  the  slope  of  wave  amplitude 
versus  distance  is  unity,  i.e., 


dy/dd '  =  1 


(2-47) 


Hence,  within  any  one  cycle  of  the  sawtooth  wave,  its  equation  is 


y  =  d' 


(2-48) 


To  allow  for  multiple  cycles  of  the  sawtooth,  it  is  appropriate  to  write  its 
equation  as 
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y  =  d'  -  INT(d' ) 


(2-49) 


where  INT(d' )  represents  the  largest  integer  less  than  or  equal  to  the  normalized 
distance  d' .  The  normalized  distance  d*  is  the  geometric  distance  d  expressed  in 
unit  wavelengths  w,  i.e., 

d'  =  d  /  w  (2-50) 


Hence , 

y  =  d/w  -  INT(d/w)  (2-51) 

An  alternative  Fourier  representation  of  the  sawtooth  wave  is 

y  =  n  -  2  I  id'  )  (2-52) 

i=l  1 

The  simple  form  of  the  sawtooth  wave  makes  it  easy  to  calculate  the  amplitude 
Yjk  of  a  wave  at  location  j  whose  origin  is  the  focal  point  at  location  k.  This 
is  done  by  computing  the  great  circle  distance  between  locations  j  and  k,  i.e., 

d  =  GCD( j , k)  (2-53) 

and  then  evaluating  equation  (2-51)  with  the  wavelength  w  known. 

But  any  single  wave  amplitude  y^  does  not  create  randomness.  The  g  field 
produced  by  the  sawtooth  wave  model  must  be  random.  Its  elements  n j  must  have 
been  selected  at  random  from  a  normally  distributed  population  with  a  mean  of 
zero  and  a  variance  of  one,  i.e.,  N(0,1).  It  is  apparent  that  the  distribution 
of  any  one  sawtooth  wave  is  uniform,  with  a  mean  of  1/2  and  a  variance  of  1/12. 
But  the  sum  of  approximately  12  uniform  random  numbers,  by  the  central  limit 
theorem,  approaches  the  normal  distribution.  Naylor,  et  al-  (1966)  give  the 
equation  for  calculating  a  normally  distributed  pseudorandom  number  G  from  the 
sum  of  N  uniform  pseudorandom  numbers  U: 


_  N  „ 

G  =  oG  /TJ7H  (  l  Un  -  f)  +  mg  (2-54) 

n=l 

where  oG  and  mg  are  the  desired  standard  deviation  and  mean,  respectively,  of  G. 
For  the  special  case  where  oG  =  1  and  pG  =  0,  and  where  the  number  of  uniform 
random  numbers  to  be  summed  is  N  =  12,  equation  (2-54)  simplifies  to 
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G  =  1  U  -  6  (2-55) 

n=l  n 


Applied  to  the  question  of  calculating  a  normally  distributed  value  Hj  for  loca¬ 
tion  j  from  the  sum  of  N  =  12  uniformly  distributed  sawtooth  wave  amplitudes  y^, 
equation  (2-55)  becomes 
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G  =  2  y,.  -  €  (2-56) 

k=l 

The  superposition  of  sawtooth  waves  is  illustrated  in  figure  2-12.  Two  saw¬ 
tooth  waves  are  shown  emanating  from  randomly  positioned  focal  points  k  =  1  and 
k  =  2.  These  waves  are  converging  on  location  j  with  respective  amplitudes  y^ 
and  y^2.  The  wavelengths  w^  of  the  two  waves  are  illustrated  ceing  different 
to  emphasize  that  those  wavelengths  were  drawn  at  random  uniformly  from  a  range 
of  possible  wavelengths. 


Figure  2-12.  Sawtooth  Waves  Emanating 
from  Focal  Points  at  Locations  k  Converge 
on  Location  j . 


2.2.9  Calculation  of  Great  Circle  Distance.  The  great  circle  distance  d  between 
any  two  points  "a"  and  "b"  on  the  globe  can  be  calculated  from  the  latitude  and 


longitude  of  point  "a"  (ea,  Aa)  and  that  of  point  "b"  (@b,  Afa).  The  conventional 
equation  is 

d  =  r  cos' 1 [sin6a  sin6fa  +  cos6a  cos6b  cos(Aa  -  Afa)]  (2-57) 

where  r  is  the  radius  of  the  earth,  approximately  6371  km.  This  equation  in¬ 
volves  calculating  five  sines  and  cosines  plus  one  arc  cosine. 

An  alternative  expression  for  the  great  circle  distance  d  can  be  obtained  by 
using  the  trigonometric  function-product  relations, 

sin6a  sin6b  =  (V*)(cos(ea  -  6b)  -  cos(6a  +  efa)]  (2-58) 

cosea  coseb  =  (»s)icos(ea  -  eb)  +  cos(ea  +  eb)]  (2-59) 

or 

sin6a  sin6b  =  (H)(d  -  s)  (2-60) 

cos6 a  coseb  =  (S)(d  +  s)  (2-61) 

where 

d  =  cos(6a  -  6b)  (2-62) 

s  =  COS(6a  +  0b)  (2-63) 

from  which  it  is  found  that 

d  =  r  cos' 1 { (H) [ (d  -  s)  +  (d  +  s)cos(Aa  -  Ab ) ] }  (2-64) 

This  equation  calls  for  evaluating  three  cosines  and  one  arc  cosine  and  should 
therefore  be  much  faster  to  solve  than  equation  (2-57). 

A  third  expression  for  the  great  circle  distance  can  be  obtained  by  using  the 
trigonometric  angle-difference  relation, 

cos  (Aa  -  Ab)  =  cosAa  cosAb  +  sinAfl  sinAfa  (2-65) 

in  equation  (2-57),  from  which  it  is  found  that 

d  =  r  cos'* (sine  sine.  +  cose  cose,  (cosA  cosA.  +  sinA,  sinA.  )]  (2-66) 

a  d  a  d  a  d  a  o 

This  equation  involves  eight  sines  and  cosines  plus  one  arc  cosine;  therefore,  it 
appears  much  less  suitable  for  use  than  equations  (2-57)  or  (2-64).  Neverthe- 


less,  equation  (2-66)  offers  some  "operational"  advantages  that  make  it  useful. 
In  particular,  one  need  not  know  the  actual  latitudes  and  longitudes  to  calculate 
great  circle  distance  from  equation  (2-66);  only  the  sines  and  cosines  of  the 
latitudes  and  longitudes  are  needed.  Since  the  number  of  focal  points  is  small 
(generally  12  or  fewer),  the  needed  sines  and  cosines  can  be  calculated  initially 
and  then  stored  for  repeated  use. 

2.2.10  Selection  of  Focal  Points  and  Wavelengths  for  the  Sawtooth  Wave 

2.2.10.1  Selection  of  Focal  Points.  Kach  sawtooth  wave  must  emanate  from  a  ran¬ 
domly  positioned  focal  point;  otherwise,  the  amplitude  sums  will  not  be  random. 
Focal  points  are  located  in  terms  of  their  latitude  (  k  and  longitude  Ak,  where, 
for  convenience,  the  longitude  ranges  from  0  degrees  through  360  degrees.  The 
longitude  Ak  of  the  kth  focal  point  is  selected  uniformly  fr-m  „he  range  0  de¬ 
grees  to  360  degrees  by  the  equation, 

AR  =  360  Uk  (2-67) 

where  Uk  is  a  pseudorandom  number  selected  from  a  population  uniformly  distrib¬ 
uted  over  the  range  (0,1). 

While  the  longitude  Ak  of  the  focal  point  can  be  selected  uniformly  from  the 
range  0  degrees  to  360  degrees,  it  is  not  true  that  the  latitude  6k  can  be 
selected  uniformly  from  the  range  0  degrees  to  180  degrees  (90  degrees  to  -90 
degrees).  This  is  because  eguiprobable  latitude  bands  are  not  equal  area  bands, 
and  simple  selection  of  latitude  would  result  in  an  overly  dense  concentration  of 
focal  points  per  unit  surface  area  near  the  poles .  Figure  2-13  shows  the  geom¬ 
etry  of  this  problem.  Needed  is  an  expression  for  the  surface  area  of  the  spher¬ 
ical  zone  bounded  by  latitudes  0j  and  e2.  The  essential  principle  is  that  the 
surface  area  of  the  zone  is  the  difference  between  the  surface  area  of  the  spher¬ 
ical  cap  formed  by  e2  and  that  formed  by  6 , . 

Consider  only  the  spherical  cap  formed  by  9,  .  This  has  height  h  in  a  sphere 
of  radius  r.  The  surface  area  of  that  cap  is 

Sj  =  2nrh  (2-68) 

But  from  the  Pythagorean  theorem, 

x2  =  r2  -  r2cos20i  =  r2(l  -  cos26i)  =  r2sin2e1  (2-69) 

and 

x  =  r  sinfli  (2-70) 

Moreover, 
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Consider  that  the  width  of  the  latitude  band  8 x  to  e2  will  always  be  constant, 
say  5  degrees  or  10  degrees.  Then  the  sine  of  one-half  their  difference  is  also 
a  constant,  say  D: 

sin  *5(8!  -  82)  =  D  (2-78) 

Thus, 


sinBi  -  sine2  =  2  D  cos  +  e2)  (2-79) 

Using  equation  (2-79)  in  equation  (2-76)  produces  the  result, 


But  r  is  constant,  so 


Sz  =  4nD  r2  cos  Vj( e 2  +  G2) 


(2-80) 


C  =  4nDr2  =  const 


(2-81) 


and. 


8  =  *s(8i  +  82) 


(2-82) 


where  8  is  the  mean  latitude  of  the  zone  bounded  by  8,  and  02 .  Using  equations 
(2-81)  and  (2-82)  in  equation  (2-80)  produces  the  result. 


S  =  C  cos  6  (2-83) 

Equation  (2-83)  shows  that  the  surface  area  of  the  spherical  zone  bounded  by 
latitudes  0!  and  e2  is  proportional  to  the  cosine  of  the  mean  latitude  of  the 
zone.  If  we  simply  choose  the  latitude  of  the  focal  point  uniformly  over  the 
permitted  range  of  latitudes,  then  the  density  of  selections  will  not  show  a 
poleward  decrease  proportional  to  the  poleward  decrease  of  zonal  surface  area  Sz. 
This  can  be  compensated  for  by  selecting  cos  6k  rather  than  6^  itself.  Since  the 
cosine  has  the  range  (0,1],  the  equation  is 

cos  0k  =  Uk'  (2-84) 

where  Uk'  is  a  uniform  pseudorandom  number  drawn  from  the  same  range.  Selection 
of  the  latitude  of  the  focal  point  in  this  way  restricts  the  focal  point  to  the 
Northern  Hemisphere,  but  this  imposes  no  limits  on  the  randomness  of  the  result. 

2.2.10.2  Selection  of  Wavelengths.  If  the  wavelength  wk  of  the  sawtooth  wave 
emanating  from 'location  k  is  to  be  selected  from  the  interval, 

Wj  <  wk  S  w2  (2-85) 
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any  value  is  equally  likely  to  be  chosen,  then  the  selection  can  be  made  by  draw- 
inq  a  random  number  uniformly  from  equation  (2-85).  If  U^"  is  a  pseudorandom 
number  drawn  from  a  uniform  distribution  having  the  range  0  to  1,  then 


wk  *  V(V2  -  Wj)  +  wt 


(2-86) 


The  algorithmic  procedure  for  the  sawtooth  wave  generator  is  shown  in  figure 


procedure  SAWTOO  (m,  ETA); 
integer  j,  k,  m,  n; 
real  w,  d,  y 

real  array  YSUM  [Ism],  ETA[l:m]; 
equivalence  (YSUM,  ETA); 

for  each  location  or  grid  point  j:  =1  step  1  until  m  do 
begin 

initialize  YSUMj:  =  0.0; 
end  j  ; 

for  each  focal  point  k:  *  1  step  1  until  n  do 

comment:  ...  n  =  12  . . . ; 

begin 

select  location  of  k^  focal  point  at  random; 

comment :  ...  Equations  ( 2-67 )  and  ( 2-84 )  . . . ; 
select  wavelength  w  at  random  from  (w1#w2); 

comment;  ...  Equation  (2-86)  — ; 
for  each  location  or  grid  pt  j :  =1  step  1  until  m  do 
begin 

calculate  distance  d:  =  GCD(j,k); 

comment:  ...  Equation  (2-66) 

calculate  wave  amplitude  y; 

comment:  ...  Equation  (2-51)  ...; 

accumulate  YSUMj  =  YSUMj  +  y; 
end  j  ; 
end  k; 

for  each  location  or  grid  pt  j :  =1  step  1  until  m  do 

begin 

ETAj :  =  YSUMj  -  6; 

comment:  ...  Equation  (2-56)  ...; 

end  j; 
end  SAWTOO; 


Figure  2-14.  Algorithm  for  Sawtooth  Wave  Submodel 


The  development  of  the  sawtooth  wave  field  simulation  model  laid  much  of  the 
groundwork  for  the  solution  of  a  class  problem,  the  simulation  of  two-dimension¬ 
al,  spatially  correlated  fields  of  a  desired  variable.  In  this  weather  simula¬ 
tion  the  sawtooth  wave  submodel  is  used  to  produce  END  fields  of  ceiling  and 
visibility  with  spatial  correlations  similar  to  those  found  in  observed  data,  and 
then  the  inverse  normalizing  functions  are  used  to  obtain  synthetic  fields  of  the 
variables . 


Model  Assumptions  and  Limitations 


Simulation  models  are  generalizations  or  simplifications  of  complex  reality. 
To  construct  such  models,  one  usually  must  make  certain  simplifying  assumptions 
that  impose  one  or  more  limitations  on  the  fidelity  with  which  the  model  repro¬ 
duces  the  complexities  of  reality.  These  limitations  can  be  in p^rtant  in  assess¬ 
ing  the  model's  applicability  in  solving  specific  operational  problems.  The 
major  assumptions  and  limitations  of  the  Ceiling/Visibility  Simulation  Model  are 
described  in  these  paragraphs. 


2.3.1  Stochastic  Assumptions.  The  0-U  process  developed  in  paragraph  2.2.1  was 
given  by  equation  (2-5)  and  is: 


Xt+1  =  P  Xt  +  1 

The  ability  of  the  O-U  process  to  reproduce  the  unconditional  distributions  of 
specific  variables  lies  in  the  assumption  that  tha  random  variable  n  has  a  mean 
of  0,  and  a  standard  deviation  of  1.  The  mathematics  of  the  model  depend  on  this 
assumption. 

An  experiment  was  performed  on  the  sawtooth  wave  submodel  to  test  the 
hypothesis  that  the  random  normal  number  fields  generated  by  this  module  indeed 
have  a  mean  of  0  and  a  standard  deviation  of  1.  In  this  experiment  the  sawtooth 
wave  submodel  was  allowed  to  generate  (using  3,  6,  9,  12,  15,  and  18  waves)  1000 
groups  of  1000  independent  random  numbers .  The  mean  and  standard  deviation  of 
each  group  of  numbers  were  calculated,  then  an  analysis  was  performed  on  the 
distribution  of  means  and  standard  deviations  for  all  1000  groups  (see  table 
2-2).  The  95-percent  confidence  intervals  given  in  the  table  were  calculated 
using  the  following  equations  taken  from  Beyer,  1968. 

The  confidence  interval  for  the  mean  of  a  sample  (p)  when  o  is  unknown,  is  given 
by  the  equation: 


Table  2-2.  Results  of  the  Sawtooth  Wave  Submodel  Test.  The  hypothesis  tested 
was  that  the  resultant  fields  of  the  sawtooth  wave  generator  have  a  mean  of  0 
and  a  standard  deviation  of  1. 


Statistical  measures  are  based  on  an  analysis  of  1000  groups  of  1000  random 
numbers  using  SAS*. 


*  OF 


SAWTOOTH 

WAVES 

VARIABLE 

MEAN 

STANDARD 

DEVIATION 

STD  ERROR 
OF  MEAN 

95%  CONFIDENCE  BOUNDS 
LOWER  UPPER 

3 

MEAN 

STD  DEV 

-0.00018654 

0.99950307 

0.0312 

0.0203 

0.00098 

0.00064 

-0.06216735 

0.95798546 

0.06179427 

1.04587269 

6 

MEAN 

STD  DEV 

0.00054176 

0.99999942 

0.0320 

0.0210 

0.00101 

0.00066 

-0.06146983 

0.95822328 

0.06255335 

1.04613209 

9 

MEAN 

STD  DEV 

0.00021389 

0.99954650 

0.0305 

0.0212 

0.00096 

0.00067 

-0.06176964 

0.95800626 

0.06219742 

1.04589558 

12 

MEAN 

STD  DEV 

0.00056509 

0.99954650 

0.0327 

0.0212 

0.00103 

0.00067 

-0.06141844 

0.95800626 

0.06254858 

1.04589558 

15 

MEAN 

STD  DEV 

-0.00147834 

0.99927165 

0.0325 

0.0220 

0.00103 

0.00069 

-0.06344479 

0.95787454 

0.06048814 

1.04575157 

18 

MEAN 

STD  DEV 

0.00144376 

0.99989719 

0.0324 

0.0220 

0.00102 

0.00069 

-0.06056149 

0.95817435 

0.06344897 

1.04607868 

*  STATISTICAL  ANALYSIS  SYSTEM 


X  -  t 


'a/2  fn 


X  ♦  t 


'a/2 


s 

Vn 


(2-87) 


Where  X  is  a  point  estimate  of  the  sample  mean,  s  is  a  point  estimate  of  the 
sample  standard  deviation,  and  *-a/2  value  for  the  Student's  t-Distribution 


at  the  appropriate  level  of  signficance  and  n-1  degrees  of  freedom.  For  a  two- 
tailed  test,  the  Student's  t  at  the  0.025  and  0.975  levels  of  significance  and 
999  degrees  of  freedom  equals  -1.96  and  +1.96. 


The  confidence  interval  for  the  standard  deviation  of  a  sample  (o)  when  a  point 
estimate  s  is  known  is  defined  by: 


[  <n-l)  **  <  0  <  /  (n-D  sa 

*  *2a/2 ; n-1  *  xZl-a/2;n-l 


(2-88) 


Where  x2  is  the  chi-square  distribution  at  the  appropriate  level  of  significance 
and  n-i  degrees  of  freedom.  The  value  for  chi-square  at  a  significance  level  of 


0.975  and  999  degrees  of  freedom  equals  912.834  and  chi-square  at  a  significance 
level  of  0.025  and  999  degrees  of  freedom  is  approximately  1088.007. 


The  results  listed  in  table  2-2  show  that  in  all  cases  the  computed  means  and 
standard  deviations  fell  within  the  95-percent  confidence  interval  of  the  hypoth¬ 
esized  distribution.  Based  on  these  results  you  cannot  reject  the  null  hypothe¬ 
sis  that  the  sawtooth  wave  submodel  produces  random  numbers  with  a  mean  of  zero 
and  a  standard  deviation  of  one. 

2.3.2  Basic  Mathematical  Assumptions 

2. 3. 2.1  Modeling  Observed  Cumulative  Distribution  ^unctions  with  the  Weibull 
Curves.  One  fundamental  assumption  made  in  the  me -.el  was  that  the  Weibull  and 
Reverse  Weibull  curves,  selected  as  the  normalizing  transformation  functions, 
could  adequately  describe  the  cumulative  probability  distributions  of  observed 
ceiling  and  visibility  data.  Using  root  mean  squared  (RMS)  difference  as  an 
indication  of  closeness  of  fit,  it  was  found  that  the  unconditional  cumulative 
probability  distributions  of  ceiling  and  visibility  for  Northern  Europe  could 
ordinarily  be  fit  by  the  Weibull  curves  to  within  3  percent.  Maximum  differences 
were  usually  less  than  6  percent.  This  accuracy  is  usually  adequate  for  most 
user's  needs  considering  that  this  is  well  within  the  accuracy  imposed  by  sam¬ 
pling  error  of  the  observed  distributions  themselves.  A  more  detailed  analysis 
of  the  curve  fits  for  Northern  Europe  is  presented  in  chapter  4. 

2. 3. 2. 2  Inverse  Transnormalization.  In  this  model,  Ornstein-Uhlenbeck  equations 
are  used  to  generate  time  series  of  correlated  ENDs  for  ceiling  and  visibility. 
Although  the  0-U  process  does  not  depend  on  the  assumption  that  the  random  vari¬ 
able  Xt  (see  equation  2-5)  is  normally  distributed,  the  inverse  transnormaliza¬ 
tion  process  does. 

Inverse  transnermalization  is  the  procedure  that  converts  the  ENDs  from  the 
0-U  equation  into  actual  values  of  these  variables.  In  this  procedure  the  uni¬ 
variate  normal  distribution  is  integrated  from  minus  infinity  to  the  given  END. 
The  cumulative  probability  determined  from  this  integration  is  then  converted  to 
a  raw  value,  with  the  appropriate  modeling  function.  Thus,  the  O-U  process  in 
WEASIM  is  assumed  to  produce  a  standard  normal  variable.  That  is,  a  variable 
whose  distribution  is  normal  with  a  mean  of  0  and  a  standard  deviation  of  1.  If 
the  results  of  the  0-U  process  are  not  N(0,1)  the  inverse  transnormalization  pro¬ 
cess  will  fail  to  produce  the  proper  probabilities  and  the  conversion  to  raw 
values  will  fail  to  produce  the  proper  frequency  distributions.  To  test  the 
hypothesis  that  the  distribution  of  variables  produced  by  the  0-U  equations  in 
this  model  is  N(0,1),  the  model  was  allowed  to  generate  2,500  ENDs  for  ceiling 
and  for  visibility.  The  test  was  repeated  using  3,  6,  9,  12,  15,  18  and  21  saw¬ 
tooth  waves  to  generate  the  random  components  of  equations  (2-36)  and  (2-37). 
The  resultant  sample  populations  of  ENDs  were  then  subjected  to  several  standard 
tests  of  normality  (Snedecor  and  Cochran,  1967). 
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The  first  test  performed  was  the  chi-square  goodness  of  fit  test.  In  a  nor¬ 
mal  chi-square  calculation  the  individual  ENDs  for  ceiling  and  visibility  would 
be  grouped  into  classes  to  form  an  observed  frequency  distribution.  The  expected 
frequency  distribution  in  each  class,  for  a  standard  normal  variate,  could  be  ob¬ 
tained  from  tables.  The  test  criterian  would  then  be  calculated  using  the  fol¬ 
lowing  equation: 

J  (  OBS-  -  EXP.  )2 

X2  =  2  1 _ _  (2-89) 

i=l  EXPi 

Where  J  is  the  number  of  intervals,  OBS  is  the  observed  number  of  occurrences  in 
each  interval,  and  EXP  is  the  expected  number  of  occurrences  in  each  interval. 
The  main  problem  with  this  approach  is  that  equation  (2-89)  assumes  that  the 
sample  population  members  are  drawn  independently.  In  the  case  of  the  Ceiling/ 
Visibility  Simulation  Model,  this  is  not  true.  The  model  was  designed  to  pre¬ 
serve  temporal  correlation  and  this  builds  in  a  serial  dependency  among  members 
of  the  sample.  To  apply  some  of  the  basic  statistical  tests  to  the  model 
results,  the  sample  size  must  be  adjusted  to  correct  for  serial  correlation.  One 
such  method  for  correcting  for  the  serial  dependency  in  a  time  series  of  observa¬ 
tions  is  to  use  equation  60  of  the  AWS  Guide  for  Applied  Climatology  (see 
references ) : 

1  -  p 

N’  =  N  (  ]  (2-90) 

Where  N'  is  the  effective  number  of  independent  observations  in  a  sample  of  size 
N  and  p  is  the  serial  correlation  of  the  data.  For  these  tests  the  actual  sample 
size  was  2,500  observations  and  the  serial  correlation  input  to  the  model  was 
0.522.  Substituting  these  values  into  equation  (2-90)  yields: 

N.  =  2500  (  >  =  785 

The  effective  sample  size  was  computed  to  be  785  independent  observations.  Equa¬ 
tion  (2-89)  may  be  rewritten  to  accept  the  input  frequencies  in  another  form, 

J  (OBSF •  •  N  -  EXPF-  •  N)2 

X2  =  1  1 _ L_ _  (2-91) 

i=l  EXPFi  •  N 

Where  N  is  the  total  sample  size,  OBSF^  is  the  observed  relative  frequency  of 
occurrence  in  each  interval  (OBS^/N),  and  EXPF^  is  the  theoretical  relative 
frequency  of  occurrence  in  each  interval  (EXP^/N).  EXPF^  can  also  be  expressed 
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as  the  probability  that  the  random  variable  X  assumes  a  value  on  the  interval 
from  a  to  b.  EXPF  is  given  by  the  integral  of  the  probability  density  f(x), 

b 

EXPF(X)  =  f  f ( X )  dx  (2-92) 

a 

If  the  chi-square  test  is  expressed  in  the  form  of  equation  (2-91)  then  the  cor¬ 
rection  for  serial  dependency  can  easily  be  made  by  substituting  N'  from  equation 
(2-90)  for  N.  The  relative  frequencies  computed  crom  the  actual  sample  size  are 
converted  to  adjusted  counts  by  performing  this  substitution. 

If  tested  data  actually  come  from  a  normal  disti ibution,  the  calculated  val¬ 
ues  for  chi-square  follows  approximately  the  theoretical  chi* - 4. are  distribution 
with  J-l  degrees  of  freedom.  If  these  data  comes  from  some  other  distribution, 
the  observed  frequencies  will  agree  poorly  with  the  expected  frequencies  and  the 
computed  chi-square  value  becomes  large.  Large  values  of  chi-square  cause  rejec¬ 
tion  of  the  null  hypothesis.  The  chi-square  test  is  a  nonspecific  test,  in  that 
the  test  criterian  is  not  directed  against  a  particular  type  of  departure  from 
normality.  Skewness  and  kurtosis  were  chosen  as  supplements  to  the  chi-square 
test. 

A  measure  of  the  amount  of  skewness  in  a  sample  is  given  by  the  equation: 

Vb  *  m  /  s3  (2-93) 

1  3 

where  Vbi  is  an  estimate  of  the  coefficient  of  skewness,  s  is  the  standard  devia¬ 
tion  of  the  sample,  and  m3  is  the  third  moment  about  the  mean,  or 

N 

m3  =  _1__  I  (X  -  X)3  (2-94) 

N  i=l 

In  equation  (2-93)  the  use  of  s3  renders  the  measurement  independent  of  the  scale 
of  these  data. 

In  the  test  for  skewness,  if  the  values  of  X  are  bunched  close  to  the  mean 
(X)  but  high  values  extend  far  above  the  mean,  Vb4  will  be  positive.  The  large 
positive  contributions  of  (X  -  X)3  when  X  is  greater  than  X,  will  predominate 
over  the  smaller  negative  contributions,  when  X  is  less  than  X.  Negative  values 
for  the  coefficient  of  skewness  are  encountered  when  the  sample  population  is 
extended  toward  the  lower  tail. 
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A  further  type  of  departure  from  normality  is  called  kurtosis. 
tion,  a  measure  of  kurtosis  is  given  by  the  equation: 


In  a  popula- 


m4  /  s4 


(2-95) 


where  b2  is  an  estimate  for  the  amount  of  kurtosis,  s  is  the  standard  deviation 
of  the  sample,  and  m4  is  the  fourth  moment  about  the  mean, 


m4  =  _1_  2  (X  -  X)4 
N  i=l 


(2-96) 


For  the  hypothetical  normal  distribution,  b2  has  the  value  3.  If  this  value  ex¬ 
ceeds  3,  there  is  usually  an  excess  of  values  near  the  mean  and  far  from  it,  with 
a  corresponding  depletion  of  the  flanks  of  the  distribution  curve.  Values  less 
than  3  result  from  curves  that  have  a  flatter  top  than  the  normal. 


Table  2-3  shows  the  results  of  these  tests  for  normality  of  the  ENDs  input  to 
the  inverse  transnormalization  process.  The  computed  values  for  the  mean,  vari¬ 
ance,  skewness,  kurtosis,  and  chi-square  are  given  for  each  test.  The  results  of 
the  tests  are  summarized  as  follows. 


In  all  cases,  the  calculated  chi-square  values  were  less  than  the  critical 
chi-square  value,  and  the  calculated  means  and  standard  deviations  were  within 
the  95-percent  confidence  interval  of  the  hypothesized  distribution.  In  all  but 
one  case,  the  departure  from  zero  of  the  calculated  coefficient  of  skewness  was 
less  than  the  standard  error  for  this  measure.  The  hypothesis  that  the  samples 
are  symmetrical  about  the  mean  cannot  be  rejected.  Finally,  in  all  but  one  case, 
the  departure  from  three  of  the  calculated  values  for  kurtosis  were  less  than  the 
standard  error  for  this  measure.  The  amount  of  kurtosis  in  the  samples  appears 
to  be  trivial.  Based  on  these  results  you  cannot  reject  the  null  hypothesis  that 
the  results  of  the  0-U  process  are  variables  from  a  population  distributed 
N(0,1). 


2. 3. 2. 3  Multivariate  Normal  Distribution  of  Ceiling  and  Visibility.  Another  key 
assumption  in  the  model  is  that  the  joint  distributions  of  ceiling  and  visibility 
in  space  and  time,  represent  an  underlying  multivariate  normal  distribution.  If 
ENDs  of  ceiling  and  visibility  were  distributed  exactly  according  to  a  multivari¬ 
ate  normal  distribution,  and  if  no  bias  were  introduced  by  the  method  used  to 
observe  and  record  the  weather,  then  WEASIM  could  produce  synthetic  joint  distri¬ 
butions  differing  from  observed  distributions  by  no  more  than  sampling  error. 


In  practice,  weather  data  contain  biases  and  inaccuracies  that  are  at  least 
as  bad  as  assuming  the  ENDs  of  ceiling  and  visibility  are  multivariate  normal. 
Three  sources  of  error:  1)  observing/recording  practices,  2)  nonmultivariate 
normality,  and  3)  sampling  error  complicate  the  process  of  "tuning"  WEASIM. 
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Table  2-3.  Normality  Tests  of  the  Generated  ENDS  for  the  Inverse 
Transnormalization  Process .  The  hypothesis  tested  was  that  the 
generated  ENDS  are  N(0,1). 


I 


#  OF 
SAWTOOTH 


WAVES 

VARIABLE 

MEAN 

VARIANCE 

SKEWNESS 

KURTOSIS 

X2 

3 

CIG 

-0.0195 

0.9813 

-0.0041 

2.8716 

9.26 

VSBY 

-0.0205 

1.0059 

0.0310 

2.8458 

9.17 

6 

CIG 

-0.0586 

1.0341 

0.^449 

2.8746 

12.39 

VSBY 

-0.0901 

0.9517 

-0.0664 

2.9987 

15.23 

9 

CIG 

0.0066 

0.9772 

-0.0486 

2.9791 

7.66 

VSBY 

0.0159 

0.9645 

-0.021J 

2.8283 

8.21 

12 

CIG 

-0.0335 

1.0230 

0.0819 

2.8680 

11.53 

VSBY 

0.0329 

.0.9807 

-0.0040 

3.3700 

10.85 

15 

CIG 

-0 . 0440 

1.0064 

0.0371 

2.9076 

15.66 

VSBY 

-0.0278 

0.9571 

-0.0632 

3.0857 

9.27 

18 

CIG 

-0.0611 

1.0214 

0.0471 

3.0744 

12.45 

VSBY 

-0.0639 

0.9778 

0.0172 

3.0026 

11.46 

21 

CIG 

0.0028 

1.0189 

0.0784 

2.8602 

12.10 

VSBY 

0.0279 

0 . 9474 

-0.1053 

2.9479 

10.38 

ACTUAL  SAMPLE  SIZE  =  N  =  2500 
EFFECTIVE  SAMPLE  SIZE  =  N'=  785 

CRITICAL  VALUES:  MEAN  =0.0 

SKEWNESS  =0.0 

x  219, 0 . 05  =  30  14 


SD:  SKEWNESS  =  ,/6/N1  =  0.087 
KURTOSIS  =  ,/24/N'  =  0.175 

VARIANCE  =1.0 
KURTOSIS  =3.0 
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2.3.3  Assumptions  Dealing  With  Correlation.  These  paragraphs  list  the  assump¬ 
tions  made  in  describing  the  correlation  structure  of  the  model. 


i 
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2. 3. 3.1  Isotropic  Spatial  Correlation.  The  sawtooth  wave  submodel  produces  a 
random  normal  number  field  with  a  horizontally  isotropic  spatial  correlation 
function.  In  such  a  function,  spatial  correlation  is  a  function  of  distance 
alone,  and  does  not  depend  on  direction,  i.e.. 


p  =  f( distance)  not  f( distance,  direction) 


One  might  assume  at  first  that  the  spatial  correlation  of  meteorological  vari¬ 
ables  is  directionally  dependent.  In  two  separate  studies  (Martin  and  Myers, 
1978,  Whiton,  et  al,  1981)  this  was  not  the  case.  The  results  of  these  two 
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studies  indicate  that  the  isotropic  assumption  is  quite  acceptable  for  ceiling, 
visibility,  and  total  cloud  cover. 


2. 3. 3. 2  Homogeneous  Spatial  Correlation.  The  mathematical  nature  of  the  saw¬ 
tooth  wave  submodel  does  not  allow  for  any  variability  of  the  spatial  correlation 
decay  function  over  different  portions  of  the  random  normal  number  field.  This 
does  not  present  a  problem  if  the  simulation  covers  a  limited  area,  say  Southern 
Germany.  As  tables  3-6  and  3-7  illustrate,  ceiling  and  visibility  do  not  display 
homogeneous  spatial  correlation  decay  functions  over  larger  areas,  such  as  the 
European  continent.  If  WEASIM  is  to  produce  synthetic  observations  for  locations 
over  a  very  large  area,  then  the  sawtooth  wavelengths  should  be  chosen  to  be 
representative  of  the  subarea  of  greatest  interest. 

2. 3. 3. 3  Homogeneous  Serial  Correlation.  WEASIM  is  configured  to  allow  only  one 
exponential  decay  constant  to  be  specified  for  ceiling  and  one  for  visibility. 
This  restriction  was  imposed  to  cut  down  on  the  computer  storage  requirements  of 
the  model.  In  actuality,  separate  constants  for  each  location  could  be  speci¬ 
fied.  Tests  conducted  by  USAFETAC  indicate  that  although  the  serial  correlation 
decay  function  of  ceiling  and  visibility  does  vary  from  location  to  location, 
this  variability  is  relatively  small.  Little  advantage  is  gained  by  increasing 
the  storage  requirements  of  the  model  to  specify  multiple  exponential  decay 
constants . 

2.3.4  Model  Limitations.  In  the  next  paragraphs  the  limitations  of  the  model 
are  discussed. 

WEASIM  cannot  produce  "forecasts."  Simulation  models  such  as  HQ  USAF's 
WARRIOR,  in  which  WEASIM  must  reside,  have  to  approximate  the  major  actions  and 
decisions  made  in  operational  practice.  Some  of  these  operator  decisions  are 
based  on  climatology,  some  on  forecasts,  and  some  on  observations.  Future  re¬ 
quests  from  HQ  USAF  may  include  a  requirement  for  "forecast"  ceiling  and  visibil¬ 
ity.  A  decision  to  include  a  forecast  module  within  WEASIM  would  of  course 
increase  computer  run  time  and  space  requirements  of  the  model . 

Some  important  limitations  result  from  the  manner  in  which  WEASIM  treats 
clouds.  First,  the  Ceiling/Visibility  Simulation  Model  does  not  consider  cloud 
layers.  The  ceiling  output  for  each  location  represents  the  lowest  level  at 
which  more  than  50  percent  of  the  celestial  dome  is  covered  with  opaque  clouds. 
WEASIM  is  not  meant  to  be  a  three-dimensional  cloud  simulation  that  gives  layer¬ 
ing  information.  Second,  this  model  does  not  consider  clear  or  scattered  condi¬ 
tions  (no  ceiling).  WEASIM  always  returns  some  value  for  ceiling  at  each 
location.  The  model  represents  the  time  that  the  sky  cover  is  less  than  50 
percent  as  very  high  ceiling  values  (CIG  >  40,000  ft).  Whereas,  in  nature  the 
value  for  total  cloud  cover  can  approach  zero,  in  this  model  the  cloud  cover  is 
maintained  at  greater  than  50  percent  but  the  ceiling  height  approaches  infinity. 
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Obviously,  the  manner  in  which  clouds  are  treated  renders  the  model  inappropriate 
for  certain  cloud-free  line-of-sight  applications. 

WEASIM  can  only  produce  synthetic  observations  for  locations  for  which  we 
have  recorded  weather  data.  Only  then,  can  ceiling  and  visibility  modeling 
coefficients  be  obtained  from  the  regression  technique  covered  in  chapter  3.  If 
modeling  coefficients  are  desired  for  locations  other  than  reporting  stations 
then  various  methods  can  be  employed  to  spread  the  climatology  to  these  intermed¬ 
iate  points.  In  a  data  rich  area  like  Northern  Europe,  USAFETAC  recommends  the 
nearest  neighbor  approach.  That  is,  modeling  coej.iicients  for  the  nearest  re¬ 
porting  location  are  chosen  to  represent  the  desired  point.  In  an  area  like  West 
Germany,  the  distance  to  the  nearest  reporting  location  is  usually  less  than  25 
kilometers.  In  data  sparse  areas  like  Africa,  the  nearest  neighbor  approach  can¬ 
not  be  used  with  any  degree  of  success. 

Finally,  WEASIM  is  configured  to  produce  synthetic  observations  for  a  maximum 
of  200  locations.  This  number  can  be  increased  or  decreased  by  adjusting  the 
array  sizes  within  the  various  COMMON  blocks. 
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Chapter  3 


MODEL  INPUT  AND  OUTPUT  DATA 


3.1  General  Information 


The  weather  simulation  model  contained  within  the  subroutine  WEASIM  was  de¬ 
signed  to  generate  observations  of  ceiling  and  visibility  that  are  consistent  in 
time  and  spatial  extent,  and  representative  of  the  geographic  area  of  the  simula¬ 
tion.  The  term  consistent  means  the  model  must  preserve  the  temporal  (serial) 
and  spatial  correlation  of  ceiling  and  visibility  lz  is  usually  observed  in 
nature;  representative  means  the  simulation  should  return  probability  distribu¬ 
tions  of  ceiling  and  visibility  for  the  modeled  locations  th  t  are  similar  to 
distributions  derived  from  observed  data. 

To  "tune"  the  simulation  model  to  a  specific  geographic  area  a  comprehensive 
study  of  the  past  weather  of  the  area  must  be  conducted  to  obtain  underlying 
probability  distributions  and  correlation  parameters  for  the  model  to  use.  This 
chapter  describes  how  such  a  study  was  conducted  for  a  European  set  of  modeling 
coefficients  and  parameters.  A  potential  user  of  WEASIM  should  keep  in  mind  that 
the  same  type  of  procedure  can  be  applied  to  any  portion  of  the  world  for  which 
historical  weather  data  exists. 

USAFETAC  in  its  role  as  the  Air  Force's  applied  climatology  center  is  in  the 
best  position  to  perform  this  type  of  data  study.  Part  of  USAFETAC' s  mission  is 
archival  of  all  weather  observations  collected  by  the  worldwide  USAF  automated 
weather  network  (AWN).  These  observations  are  from  all  countries  (including 
Warsaw  Pact  countries)  that  belong  to  the  World  Meterological  Organization  (WMO) 
and  exchange  weather  data  through  international  agreement.  This  weather  data  is 
decoded  at  the  Air  Force  Global  Weather  Central  ( AFGWC ) ,  Offutt  AFB,  NE  and  is 
stored  on  magnetic  tape  at  USAFETAC,  Scott  AFB,  IL  and  USAFETAC,  OL-A,  Asheville, 
NC.  The  format  in  which  data  are  kept  on  tape  is  referred  to  as  DATSAV.  A  short 
description  of  data  collection  procedure  follows: 

Each  of  the  overseas  AWN  sites  (Fuchu  Air  Station,  Japan;  Clark  AB,  Philip¬ 
pine  Islands;  and  RAF  Croughton,  England)  collects  data  disseminated  by  various 
countries  over  radio  teletype  transmission  and  continuous  wave  broadcast.  These 
observations  are  then  transmitted  via  high-speed  circuits  to  the  AWN  site  at 
Carswell  AFB,  Texas  (Det  7,  AFGWC).  All  other  data  collected  by  the  Carswell  AWN 
site  are  added  to  the  overseas-collected  data  and  transmitted  again  to  AFGWC  at 
Offutt  AFB,  NE,  where  it  is  further  processed. 

Each  specific  type  of  data  is  decoded  by  appropriate  computer  programs  de¬ 
signed  to  process  various  types  of  weather  codes.  The  decoding  process  consists 
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of  identifying  each  part  of  the  observation  and  placing  the  .  specific  element 
values  into  a  predetermined  disk-storage  location.  Daily,  the  disk-storage  areas 
are  output  to  tape  for  AFGWC  use  and  subsequent  transmission  to  USAFETAC.  USAF- 
ETAC  reformats  each  observation  into  DATSAV  format  and  eventually  creates  a  POR 
tape  of  individual  stations  for  analysts'  use. 

Data  quality  ranges  from  very  good  for  the  Western  industrialized  countries 
to  questionable  for  the  poorer  less  advanced  countries.  In  the  west  the  individ¬ 
ual  countries  attempt  to  follow  published  WMO  procedures  for  observing  and  re¬ 
porting  ceiling  and  visibility  as  part  of  their  weather  observations.  In  the 
instances  where  a  country  deviates  from  published  procedures  it  usually  publishes 
a  bulletin  listing  national  practices  as  an  addendum  to  WMO  publications.  The 
quality  of  the  data  of  the  less  industrialized  countries  and  of  the  Warsaw  Pact 
countries  is  markedly  poorer.  Data  coverage  is  not  as  great  and  not  all  the 
observations  make  it  to  the  AWN  circuits.  This  is  especially  true  of  weather 
observations  at  Warsaw  Pact  military  airfields.  These  problems  will  be  discussed 
in  depth  in  later  paragraphs  of  this  chapter. 

3.2  Input  Data 

Data  that  must  be  input  to  the  model  consists  of  modeling  coefficients  for 
each  location  to  be  simulated,  values  for  the  sines  and  cosines  of  the  latitude 
and  longitude  of  each  location,  correlation  parameters  that  will  dictate  the  cor¬ 
relation  functions  to  be  reproduced  by  the  model,  and  finally  initial  ceiling  and 
visibility  conditions  at  each  location.  These  model  inputs  will  now  be  described 
in  detail. 

3.2.1  Modeling  Coefficients  for  Visibility.  It  might  be  useful  to  review  the 
inverse  transnormalization  process  here.  The  mathematics  of  WEASIM  are  based  on 
multivariate  normal  theory  and  the  model  works  in  normal  space.  All  correlations 
are  based  on  computations  with  ENDs  and  not  the  actual  values  of  ceiling  and 
visibility.  Inverse  transnormalization  provides  a  method  by  which  the  generated 
fields  of  ENDs  may  be  converted  to  raw  values  of  ceiling  and  visibility  (see 
figure  3-1). 

The  model  generates  ENDs,  and  by  integration  of  the  normal  density  function, 
these  ENDs  are  converted  to  values  of  the  cumulative  distribution  function.  The 
resultant  CDFs  are  then  used  in  the  modeling  equations  to  compute  raw  values  for 
ceiling  and  visibility.  The  local  climatology  of  an  area  comes  into  play  in  this 
transnormalizing  process  in  the  following  manner. 

A  typical  POR  tape  in  DATSAV  format  for  a  given  weather  station  contains  all 
the  surface  weather  observations  for  that  location  in  the  last  5,  10,  15,  etc. 
years  (depending  on  the  length  of  the  POR).  These  observations  can  then  be  sorted 
by  month  and  time  of  day  to  produce  CDFs  for  specific  periods.  For  WEASIM  the 
cumulative  probabilities  that  the  visibility  or  ceiling  are  less  than  selected 
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Figure  3-1.  Schematic  of  the  Inverse  Transnormalization  Process. 


thresholds  are  created  from  the  historical  record  for  each  month  for  eight  time 
periods  (23-01,  02-04,  05-07,  . .  20-22  GMT  ).  These  time  periods  were  selec¬ 
ted  to  coincide  with  the  mid-period  hours  of  00,  03,  06,  ..,  21  GMT,  respective¬ 
ly.  NOTE:  Not  all  weather  stations  take  hourly  observations.  Some  take 
observations  every  three  hours  at  the  times  mentioned  above,  so  an  attempt  was 
made  to  center  the  time  intervals  on  the  hours  of  maximum  observation  counts. 
The  historical  CDFs  are  then  regressed  on  the  theoretical  distributions  to  obtain 
the  modeling  coefficients.  This  distribution  fitting  procedure  is  the  key  to 
providing  geographic  realism  to  the  model.  The  diurnal  and  seasonal  variations 
of  ceiling  and  visibility  at  each  location  are  incorporated  in  the  variability  of 
these  modeling  coefficients.  The  regression  procedure  shall  now  be  discussed  in 
detail . 

3. 2. 1.1  Basic  Modeling  Equation.  USAFETAC's  basic  modeling  equation  for  visi¬ 
bility  is  the  Weibull  curve.  This  function  was  first  applied  to  fitting  visibil¬ 
ity  distributions  by  Somerville,  Bean,  and  Falls  (1979).  The  Weibull  curve  is 
given  by  the  equation. 


P  =  1  -  exp(-uxTP)  (3-1) 

where  a  and  0  are  modeling  coefficients  determined  from  empirical  distributions, 
xT  is  some  threshold  visibility  in  statute  miles,  and  P  is  the  probability  that 
the  actual  visibility  (X)  is  less  than  or  equal  to  xT- 


3. 2. 1.2  Inverse  Transnormalizing  Equation.  Values  of  visibility  can  be  obtained 
from  given  probabilities  by  solving  equation  (3-1)  for  x_, 


[3. 2. 1.3  Obtaining  Modeling  Coefficients.  In  most  simulation  applications  using 
the  Weibull  distribution,  the  standard  method  for  estimating  a  and  p  from  a  given 
[  sample  is  by  an  iterative  solution  of  the  maximum  likelihood  equations.  USAFETAC 

uses  a  different  approach.  To  obtain  the  modeling  coefficients  a  and  p,  the  val¬ 
ues  for  an  empirical  cumulative  distribution  are  regressed  on  the  Weibull  cumula¬ 
tive  distribution  function.  The  resulting  coefficients  are  those  which  minimize 
“  the  sum  of  the  squares  of  the  differences  between  the  observed  and  modeled 

•  (Weibull)  cumulative  distributions.  This  has  great  appeal  for  this  particular 

j  application;  the  object  is  to  reproduce  the  observed  probability  distributions  as 

I  closely  as  possible  and  not  estimate  a  and  p  for  their  own  sake.  USAFETAC  uses  a 

|  linear  regression  scheme  developed  by  Maj  A1  Boehm,  USAFETAC/DNM,  which  is: 

I 

Let  Q  =  1  -  P  (the  probability  that  X  is  greater  than  xT)  and  substitute  this 
value  into  equation  (3-1): 

I  Q  =  exp(-axTP)  (3-3) 

Take  the  natural  logarithm  of  each  side  of  equation  (3-3),  which  yields. 

In  Q  =  -axTP  (3-4) 

equation  (3-4)  can  be  rewritten  as, 

-In  Q  =  axTP  (3-5) 

Equation  (3-5)  is  in  the  form  of  a  regular  power  function,  which  can  be  linear¬ 
ized  by  taking  the  natural  logarithm  of  each  side. 

ln[-ln  Q]  =  In  o  +  ^  In  ^  (3-6) 

For  a  power  function  fit  by  the  method  of  least  squares,  the  estimates  of  a  and  0 
are  obtained  by  fitting  a  straight  line  to  the  set  of  ordered  pairs  In  X^, 
ln[-ln  Q] .  Substituting  these  values  into  the  normal  equations  for  a  straight 
line,  the  solution  for  p  becomes: 

n2(ln  xT)(ln[-ln  Q] )  -  (21nx_) (iln[ -In  Q) ) 

P  =  - I - i -  (3-7) 

nIxT2  -  (ilnxT)2 

The  solution  for  a  becomes, 

a  =  exp( Ini -In  QJ  -  p  5^)  (3-8) 

NOTE:  For  the  linearization  technique  to  be  successful,  0  <  Q  <  1.  All  ordered 
pairs  where  Q  is  equal  to  1  or  0  should  not  be  used  in  the  regression,  because 
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ln[-ln  Q]  will  be  undefined.  However,  this  simple  linear  regression  minimized 
the  root  mean  squared  error  in  ln[ln  Q]  space  and  was  not  necessarily  the  case 
when  translated  to  Q  space.  To  approximate  minimum  errors  in  Q  space,  it  was 
necessary  to  apply  a  weighting  factor  to  each  data  point.  The  resultant  equation 
for  calculating  the  modeling  coefficients  became: 


p  =  IWF  I (In  xT)(ln[-ln  Q])(WF)  -  <Z  WF  In  xT)(Z  WF  ln[-ln  Q] ) 

(IWF  Z  WF  In  xT^)  -  (Z  WF  In  xT)z 


a  =  exp  (  S-g^rkLg] 


where 


Z  WF  In 
fWF 


WF  =  (Q  In  Q)* 


*2  ) 


(3-9) 

(3-10) 


(3-11) 


3. 2. 1.4  Regression  Example.  To  see  how  this  technique  works  in  practice 
consider  the  example  given  in  table  3-1. 


Table  3-1.  Empirical  and  Modeled  Cumulative  Probabilities  that 
the  Visibility  was  Less  than  Selected  Threshold  Values  for  Sembach 


• 

* 

► 

AB,  FRG,  January, 

2300-0100  GMT. 

h 

c 

V 

Threshold 

a 

Visibility 

Observed 

Modeled 

i 

w 

xt(SM) 

Pr  (  X  S  Xj  ) 

Pr  (  X  S  Xj  ) 

Residual 

■ 

4 

• 

1 

0.025 

0.000 

0.004 

-0 . 004 

: 

0.313 

0.004 

0.006 

-0.002 

■i 

0.500 

0.011 

0.013 

-0.002 

■ 

0.625 

0.018 

0.019 

-0.001 

1 

0.750 

0.029 

0.025 

0.004 

1 

1.000 

0.040 

0.040 

0.000 

'4 

1.2S0 

0.061 

0.058 

0.003 

1.500 

0.068 

0.078 

-0.010 

2.000 

0.086 

0.123 

-0.037 

. 

2.500 

0.189 

0.174 

0.015 

•* 

3.000 

0.235 

0.229 

0.006 

4.000 

0.339 

0.343 

-0.004 

1 

5.000 

0.467 

0.458 

0.009 

P 

6.000 

0.556 

0.564 

-0.008 

* 

* 

Table  3-2  gives  the  various  sums,  sums  of  squares,  and  sums  of  cross  products 
needed  for  the  calculations  of  the  modeling  coefficients.  The  calculated  values 
for  ZWF,  ZWF-lnxT,  IWF’lnxT2,  IWF* ln(-lnQ),  and  ZWF*lnxT*ln(-lnQ)  are  substituted 
into  equation  (3-9)  to  compute  0: 


/>.*. 
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Table  3-2.  Summation  Terms  Required  in  the  Regression  of  Visibility  Data  for  Sembach  AB,  FRG,  January, 
2300-0100  GMT. 


r~  «ru. 


r*jTiw^r;a 


r  wamiHiu-  mumi  wwwm? 


(0.40554748) (-0.34241127)  -  (0. 59700708 )(-0. 29271346) 
(0.40554748 )(0. 93175941)  -  (0.59700708)* 

=  1.6726915 


p  is  in  turn  used  in  equation  (3-10),  with  IWF,  IWF* ln(-lnQ) ,  and  IWF-lnx^,,  to 
compute  a : 


a 


exp  l 


-0.29271346 

0.40554748 


0.041413631 


( 1 .6726915 ) ( 


0.59700708 

0.40554748 


)  ] 


The  resultant  coefficients  can  then  be  substituted  into  equation  (3-1)  to  obtain 
the  modeled  probabilities  for  given  thresholds  (see  table  3-1  for  the  modeled  CDF 
and  the  residuals  between  the  observed  and  modeled  curves).  In  the  above  example 
the  Weibull  curve  provided  an  excellent  fit  to  the  observed  distribution.  The 
RMS  of  the  fit  was  0.012  and  the  maximum  difference  between  the  curves  was  only 
0.037.  Figure  3-2  illustrates  this  curve  fitting  example.  Remember,  the  fitted 
curve  is  the  Weibull  function  for  which  the  sum  of  the  squares  of  the  differences 
between  the  fitted  CDF  and  the  empirical  CDF  were  minimized.  The  RMS  differences 
are  always  taken  at  the  threshold  values  used  in  the  actual  curve  fit. 


Figure  3-2.  Observed  and  Modeled  CDFs  for  Visibility 
at  Sembach  AB,  FRG,  January,  2300-0100  GMT. 
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In  WEASIM  the  visibility  coefficients  are  stored  in  the  labelled  COMMON  block 
/COEF/ .  The  a '  s  are  stored  in  the  array  ALPHA  and  the  p '  s  are  stored  in  the 
array  BETA.  The  following  indexing  convention  is  followed  for  each  coefficient: 


ALPHA (  NLOC,  NT  ) 

i 

Location  Index 
1  i  NLOC  S  200 


Time  Period  Index 

1  a  23-01  GMT 

2  »  02-04  GMT 


8  a  20-22  GMT 


3.2.2  Modeling  Coefficients  for  Ceiling.  The  empirical  cumulative  distributions 
for  ceiling  are  created  in  the  same  manner  as  that  mentioned  for  visibility  in 
paragraph  3.1.2. 


3. 2. 2.1  Basic  Modeling  Equation.  USAFETAC  uses  a  slightly  different  form  of  the 
Weibull  curve,  called  the  "Reverse  Weibull,"  to  fit  ceiling  data  than  it  does  for 
visibility  data.  This  form  was  developed  by  Lt  Col  Pershing  Hicks,  USAFETAC/DNY 
(Hicks,  1982).  The  reverse  Weibull  curve  for  ceiling  data  is  given  by  the 
equation, 

P  =  exp(-axTP)  (3-12) 

where  a  and  8  are  modeling  coefficients  determined  from  empirical  distributions, 
xT  is  some  threshold  ceiling  in  feet,  and  P  is  the  probability  that  the  actual 
ceiling  (X)  is  less  than  or  equal  to  x^. 

3. 2. 2. 2  Inverse  Transnormalizing  Equation.  Values  of  ceiling  can  be  obtained 
from  given  probabilities  by  solving  equation  (3-12)  for  x^,, 

xT  =  [  )1/P  (3-13) 

3. 2. 2. 3  Obtaining  the  Modeling  Coefficients.  To  obtain  the  modeling  coeffici¬ 
ents  a  and  p,  the  weighted  linear  regression  technique  developed  by  Hicks  and 
Boehm  of  USAFETAC  (mentioned  in  the  paragraph  3. 2. 1.3)  is  used.  Values  of  a  and 
p  are  obtained  by  fitting  a  straight  line  to  the  set  of  ordered  pairs  of  data 
In  xT,  ln(-ln  P]  with  the  following  equations: 


p  =  2WF  2 ( In  xT)(ln[-ln  P])(WF)  -  (2  WF  In  xT)(I  WF  ln[-ln  P) ) 
(2WF  I  WF  In  xT2)  -  (I  WF  In  x^T2 


a 


exp  ( 


2  WF  Inf -In  P] 
2  WF 


-  P 


2  WF  In 


*T  ) 


2  WF 


(3-14) 

(3-15) 


3. 2. 2. 4  Regression  Example.  An  example  for  the  regression  of  ceiling  data  on 
the  Reverse  Weibull  curve  is  presented  in  table  3-3. 


Table  3-3.  Empirical  and  Modeled  Cumulative  Probabilities  that 
the  Ceiling  was  Less  than  Selected  Threshold  Values  for  Berlin/ 
Schonefeld,  GDR,  July,  1700-1900  GMT . 


Threshold 

Ceiling 

Observed 

Modeled 

xT(ft) 

Pr  (  X  S  xT  ) 

Pr  (  X  *  xT  ) 

Residual 

100 

0.000 

0.000 

0.000 

200 

0.000 

0.000 

0.000 

300 

0.000 

0.000 

0.000 

400 

0.000 

0,001 

-0.001 

500 

0.000 

0.002 

-0.002 

600 

0.003 

0.003 

0.000 

700 

0.007 

0.005 

0.002 

800 

0.010 

0.007 

0.003 

900 

0.012 

0.009 

0.003 

1000 

0.014 

0.012 

0.002 

1200 

0.023 

0.018 

0.005 

1500 

0.028 

0.029 

-0.001 

1800 

0.044 

0.041 

0.003 

2000 

0.051 

0.049 

0.002 

2500 

0.062 

0.069 

-0.007 

3000 

0.069 

0.089 

-0.020 

3500 

0.099 

0.109 

-0.010 

4000 

0.117 

0.127 

-0.010 

4500 

0.153 

0.145 

0.008 

5000 

0.165 

0.162 

0.003 

6000 

0.199 

0.192 

0.007 

7000 

0.243 

0.220 

0.023 

8000 

0.249 

0.245 

0.004 

9000 

0.265 

0.268 

-0.003 

10000 

0.270 

0.288 

-0.018 

Table  3-4  gives  the  various  sums,  sums  of  squares,  and  sums  of  cross  products 
needed  for  the  calculations  of  the  modeling  coefficients.  The  calculated  values 
for  ZWF,  zWF-lnxT,  ZWF'lnXj2,  ZWF-ln(-lnP),  and  ZWF- lnx^* ln(-lnP)  are  substituted 
into  equation  (3-14)  to  compute  0: 


(1.0096402) (4. 5404001)  -  (8. 7135020)(0. 5431495) 
(1.0096402)(75. 4668761)  -  (8.7135020)2 
=  -0.55171818 


8  is  in  turn  used  in  equation  (3-15),  with  ZWF,  ZWF*  ln(-lnP),  and  ZWF'lrutp,  to 
compute  a: 
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The  resultant  coefficients  can  then  be  substituted  into  equation  (3-12)  to  obtain 
the  modeled  probabilities  for  given  thresholds  (See  table  3-3  for  the  modeled  CDF 
and  the  residuals  between  the  observed  and  modeled  curves).  In  the  above  example 
the  Reverse  Weibull  curve  provided  an  excellent  fit  to  the  observed  distribution. 
The  RMS  of  the  fit  was  0.008  and  the  maximum  difference  between  the  curves  was 
only  0.023.  Figure  3-3  illustrates  this  curve  fitting  example.  As  in  the  case 
of  visibility,  the  fitted  curve  is  the  Reverse  Weibull  function  for  which  the  sum 
of  the  squares  of  the  differences  between  the  fitted  CDF  and  the  empirical  CDF 
were  minimized. 


0.300  -a 


Figure  3-3.  Observed  and  Modeled  CDFs  for  Ceiling  at 
Berlin/Schonefeld,  GDR,  July,  1700-2000  GMT. 

The  coefficients  for  ceiling  are  stored  in  the  labelled  COMMON  block  /COEF/. 
The  a ' s  for  each  location  are  stored  in  the  array  A  and  the  p's  are  stored  in 
array  B.  The  same  indexing  convention  is  followed  for  the  ceiling  coefficients 
as  that  described  in  paragraph  3. 2. 1.4. 

3.2.3  Location  Coordinates.  A  critical  part  of  the  sawtooth  wave  submodel  pro¬ 
cess  is  the  need  to  compute  the  great  circle  distances  between  each  of  the  ran¬ 
domly  generated  wave  focal  points  and  the  locations  to  be  modeled.  To  calculate 
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these  distances  the  coordinates  of  the  focal  points  and  each  of  the  locations  to 
be  modeled  are  required  by  the  model  in  some  form.  In  paragraph  2.2.9  the  equa¬ 
tion  that  WEASIM  uses  to  calculate  the  great  circle  distances  was  presented 
(equation  2-66).  In  this  equation  the  actual  latitude  and  longitude  of  the  two 
points  are  not  used.  Rather,  the  sines  and  cosines  of  the  coordinates. 

The  decision  had  to  be  made  during  the  design  phase  of  the  model,  whether  to 
store  the  location  position  variables  as  latitude/longitude  or  as  trigonometric 
equivalents  of  the  coordinates.  It  was  felt  that  the  better  choice  was  to  store 
the  location  variables  as  trigonometric  functions. 

It  is  computationally  more  efficient  to  compute  the  trigonometric  functions 
once  (during  the  initialization  phase  of  the  model)  and  store  these  functions  in 
memory  rather  than  recompute  them  during  each  replication  of  the  model.  The 
model  designer  felt  that  the  savings  in  computer  run  time  more  than  offset  the 
increase  of  storage  required  by  WEASIM  (4  words  of  memory  vs.  2  for  each  loca¬ 
tion).  The  model  is  presently  configured  for  200  locations  so  this  option 
increased  core  storage  by  400  32-bit  words.  If  core  storage  is  of  more  concern 
to  a  potential  user  then  model  run  time,  then  WEASIM  could  be  modified  to  store 
actual  geographic  coordinates  rather  then  the  trigonometric  equivalents. 

Paragraph  2.2.10.1  of  this  technical  note  covers;  the  process  by  which  the 
coordinates  of  each  of  the  wave  focal  points  are  randomly  generated.  The  coordi¬ 
nates  of  the  individual  locations  being  modeled  are  read  into  WEASIM  during  ini¬ 
tialization.  USAFETAC  has  provided  an  initialization  routine  called  WEAINT  to  HQ 
USAF/SA  for  the  Ceiling/Visibility  Simulation  Model.  This  routine  is  not 
explicitly  covered  by  this  technical  note  but  the  procedure  used  to  compute  the 
necessary  positional  variables  is  summarized  as  follows. 

The  latitude  and  longitude  of  each  location  being  modeled  should  be  stored  in 
the  input  coefficient  file  as  degrees  in  decimal  form  DDD.dd  (rather  than 
degrees-minutes-seconds ) .  Storing  the  geographic  coordinates  in  this  form  rather 
than  trigonometric  equivalents  directly  allows  a  user  to  identify  easily  the 
location  for  which  the  coefficients  apply.  It  does  not  matter  what  convention 
for  signs  (+/-)  is  followed,  although  for  the  purposes  of  this  technical  note  it 
will  be  assumed  that  north  latitude  will  be  positive  (+),  south  latitude  will  be 
negative  (-),  east  longitude  will  be  positive,  and  west  longitude  will  be  nega¬ 
tive.  The  latitude  and  longitude  for  each  location  in  the  input  coefficient  file 
are  first  read  into  temporary  storage  and  then  processed  in  the  following  manner. 

3. 2. 3.1  Latitude.  Latitude  in  degrees  is  converted  directly  to  radians  by 
applying  the  equation: 


LAT  (RAD)  =  LAT  (DEG)  •  0.01745329 


(3-17) 


where  0.01745329  is  the  degrees  to  radians  conversion  factor.  The  sine  and 
cosine  of  the  latitude  is  then  computed  by  invoicing  the  American  National 
Standard  (ANS)  FORTRAN  library  functions  SIN  and  COS  with  the  following  calls. 


E, 


SINLAT  =  SIN  (  LAT(RAD)  ) 


and. 


COSLAT  -  COS  (  LAT(RAD)  ) 

These  trigonometric  functions  are  then  stored  in  the  labelled  COMMON  block 
/GRIDCM/. 

3. 2. 3. 2  Longitude.  Longitude  is  handled  differently.  Since  tne  computation  of 
GCD  is  based  on  spherical  trigonometry  the  longitude  must  first  be  converted  to  a 
360°  reference  coordinate  rather  then  the  normal  -180°  to  +180°.  After  this  con¬ 
version  the  degrees  to  radians  correction  factor  can  then  be  applied. 

If  the  location  is  in  the  Eastern  Hemisphere  (  LNG  (DEG)  >0.  )  then  equation 
(3-18)  applies: 


LNG  (RAD)  =  LNG  (DEG)  •  0.01745329  (3-18) 

If  the  location  is  in  the  Western  Hemisphere  (  LNG  (DEG)  <0.  )  then: 

LNG  (RAD)  *  (360.  -  |  LNG  (DEG)  |  )  •  0.01745329  (3-19) 

The  sine  and  cosine  of  the  longitude  are  then  computed  by  once  again  invoking  the 
ANS  FORTRAN  library  functions  SIN  and  COS  with  the  following  calls: 

SINLNG  -  SIN  (  LNG  (RAD)  ) 


and. 


COS LNG  =  COS  (  LNG  (RAD)  ) 

These  trigonometric  functions  are  then  stored  in  the  labelled  COMMON  block 
/GRIDCM/. 

3.2.4  Cross-variable  correlation  Coefficient.  To  estimate  the  cross-variable 
correlation  of  ceiling  and  visibility  a  study  of  57  European  locations  was 
conducted. 


Within  the  Air  Weather  Service  Technical  Library  (AWSTL)  there  exists  statis¬ 
tical  tabulations  of  historical  weather  data  for  hundreds  of  locations  worldwide. 
These  tabulations  are  called  RUSSWOs  and  N-Summaries.  An  important  part  of  these 
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tabulations  ara  probability  tablaa  listing  joint  oeeurranca  information  of  ceil* 
ing  vs.  visibility.  Thasa  probability  tablas  ara  constructad  by  month  and  tima 
of  day.  However,  thara  ara  tablas  for  aach  month  for  all  timas.  Thasa  monthly 
tabulations,  for  January  and  July,  vara  the  ones  usad  for  this  corralation  study. 
Tha  ceiling/visibility  cross-variable  corralation  coafficiant  may  ba  calculated 
easily  from  thasa  tablas  using  tha  tetrachoric  correlation  coefficient  described 
in  Appendix  A. 

Cross-variable  correlation  coefficients  were  calculated  for  57  different 
locations  across  Europe.  The  sample  stations  included  locations  from  England, 
France,  Denmark,  West  and  East  Germany,  Poland,  Finland,  Turkey,  and  the  Soviet 
Union.  An  effort  was  made  to  select  stations  exhibiting  a  wide  spectrum  of  geo¬ 
graphic  and  terrain  characteristics.  However,  to  emphasize  HQ  USAF/SA  areas  of 
higher  interest,  more  stations  were  selected  from  East  and  West  Germany  than  any 
other  area. 
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Figure  3-4.  Analysis  of  the  Distribution  of  Cross-variable 
Correlation  Cofficients  Computed  from  Observed  Data  for  57 
Locations  in  Europe  for  January. 


Figure  3-4  shows  the  overall  distribution  of  cross-variable  correlation  coef¬ 
ficients  for  all  stations  in  January;  figure  3-5  shows  the  distribution  in  July. 
The  average  correlation  coefficient  for  all  locations  in  January  was  0.520  and 
the  standard  deviation  was  0.195.  In  July  the  average  correlation  was  0.267  with 
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Figure  3-5.  Analysis  of  the  Distribution  of  Cross-variable 
Correlation  Cofficients  Computed  from  Observed  Data  for  57 
Locations  in  Europe  for  July. 


a  standard  deviation  of  0.207.  Meteorologically  this  is  what  one  might  expect. 
In  the  winter  the  weather  patterns  are  dominated  by  large  scale  systems  that  tend 
to  affect  both  ceiling  and  visibility  together.  In  the  summer,  when  the  weather 
is  more  affected  by  localized  showers,  fog,  haze,  etc.,  ceiling  and  visibility 
are  more  independent  (less  related).  Remember,  a  model  is  merely  a  simplifica¬ 
tion  of  the  more  complex  reality  it  is  trying  to  simulate,  so  the  goal  is  to  find 
a  single  "representative"  value  for  cross-variable  correlation. 


Table  3-5  gives  examples  of  cross-variable  correlation  coefficients  that  may 
be  used  in  the  model  depending  on  season  and  area  of  interest.  The  cross-vari¬ 
able  correlation  coefficient  is  stored  in  the  labelled  COMMON  block  /CORREL/ 
within  WEASIM. 

3.2.5  Sawtooth  Wavelengths.  To  estimate  the  spatial  correlation  decay  functions 
of  ceiling  and  visibility  in  Europe,  an  extensive  study  of  surface  weather  ob¬ 
servations  was  conducted.  USAFETAC,  OL-A,  prepared  tables  of  joint  occurrence  of 
ceiling  and  of  visibility,  for  34  station  pairs,  for  January  and  July.  The 
tables  were  based  on  a  nine  year  POR  from  1973  to  1961.  The  station  pairs  were 
selected  to  provide  values  of  calculated  spatial  correlation  coefficients  at  var¬ 
ious  distances.  From  these  coefficients  a  modeled  spatial  correlation  decay 
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Table  3-5.  Average  Cross-variable  Correlation  Coefficients 
Computed  from  Observed  Data  for  Various  Areas  of  Interest.  The 
numbers  in  parentheses  are  the  number  of  stations  used  in  the 
calculations  for  each  case. 


AREA 

JANUARY 

JULY 

Northern  Europe  Overall 
(57) 

.520 

.267 

East  and  West  Germany  Overall 
(30) 

.468 

.262 

East  and  West  Germany  North  of 
(13) 

51N 

.445 

.198 

East  and  West  Germany  South  of 
(17) 

51N 

.485 

.310 

Finland  and  Northern  USSR 
(7) 

.581 

.366 

European  USSR 

.554 

.287 

(12) 


function  for  the  two  selected  months  was  obtained.  The  procedure  used  is  sum¬ 
marized  as: 

A  reporting  weather  station  was  selected  as  a  base  station  in  one  of  4  dif¬ 
ferent  geographic  areas  of  Northern  Europe.  Then,  joint  probability  of  occur¬ 
rence  tables  for  the  individual  base  stations  and  selected  reporting  sites  were 
generated  by  USAFETAC,  OL-A.  The  selected  sites  were  chosen  at  increasing  dis¬ 
tances  from  the  base  stations  (see  table  3-6).  Correlation  coefficients  were 
calculated  from  the  joint  occurrence  tables  using  the  tetrachoric  method 
described  in  Appendix  A. 


Table  3-6.  Station  Data  Processed  in  the  Calculation  of  the 
Spatial  Correlation  Decay  Functions. 

RANGE  OF  DISTANCE 

BASE  STATION  #  OF  SELECTED  SITES  FROM  BASE  STATION  (km) 
Frankfurt,  FRG  14  13-248 

Weiden,  FRG  8  80-278 

Kaufbeuren,  FRG  6  19-178 

Daugavpils,  USSR 


6 


83-392 


Overall,  a  total  of  34  cases  at  distances  of  13  to  392  km  were  used  to  develop 
the  modeled  spatial  correlation  decay  functions  of  ceiling  and  of  visibility. 

3. 2. 5.1  Fitting  Observed  Spatial  Correlation  Data  to  Gringorten's  Model-B.  In 
paragraph  2.2.7  it  was  mentioned  that  Gringorten's  Model -B  has  been  used  quite 
successfully  in  modeling  the  spatial  correlation  decay  function  of  ceiling  and 
visibility.  Recalling  equations  (2-41)  and  (2-42),  Gringorten's  Model-B  is 
defined  by, 

r  =  r  (d,  D)  =  — —  [cos"1  o  -  ) 

71 

where , 

o  =  d  /  (128  D) 

In  these  equations,  d  is  the  distance  between  locations  and  D  is  an  appropriate 
scale  distance  for  the  particular  variable  and  geographic  area.  The  goal  then  is 
to  obtain  the  value  D  when  r  and  d  are  known  from  the  observed  data.  This  is 
accomplished  by  a  simple  least  squares  fit  of  the  observed  data.  The  resultant 
scale  distance  is  that  scale  distance  which  minimizes  the  RMS  difference  between 
the  empirical  correlation  figures  and  the  Gringorten  Model-B  correlation  decay 
function.  Once  again,  a  model  is  merely  a  simplification  of  complex  reality. 
The  object  here  is  to  find  a  value  for  the  Gringorten  Model-B  scale  distance  that 
produces  a  representative  spatial  correlation  decay  function.  This  representa¬ 
tive  function  will  not  necessarily  fit  the  observed  data  perfectly.  Figure  3-6 
is  a  comparison  of  Gringorten's  Model-B  and  spatial  correlation  coefficients  cal¬ 
culated  from  July  visibility  data,  for  the  34  station  pairs.  The  Gringorten 
Curve  is  for  a  scale  distance  of  3.74  kilometers  which  was  the  "best  fit"  line 
through  all  the  observed  data  points.  The  figure  illustrates  one  of  the  weak¬ 
nesses  of  Gringorten's  model.  That  is,  the  spatial  correlation  coefficient  from 
the  model  decreases  to  zero  more  rapidly  then  the  correlation  coefficients  calcu¬ 
lated  from  observed  data.  It  does  fit  the  observed  data  quite  well  at  distances 
less  than  300  kilometers. 

Equations  (2-45)  and  (2-46)  are  used  to  convert  a  given  Gringorten  Model-B 
scale  distance  to  maximum  and  minimum  allowable  wavelengths  for  the  sawtooth  wave 
generator  to  use.  These  equations  are, 

W  (upper)  *  560  *  SD 


W  (lower)  =  205  *  SD 


200 


DISTANCE  (km) 


Figure  3-6.  Plot  of  Gringorten's  Model-B  vs.  spatial 
Correlation  Coefficients  Calculated  from  Observed 
July  visibility  Data.  The  Gringorten  curve  is  for  a 
scale  distance  of  3.74  km  and  the  observed  data  is 
for  34  station  pairs  in  Northern  Europe. 


3. 2. 5. 2  Wavelengths  for  Ceiling.  Table  3-7  summarizes  the  results  of  the  spa¬ 
tial  correlation  study  for  ceiling  over  Northern  Europe  The  recommended  saw¬ 
tooth  wavelengths  that  will  create  representative  spatial  correlation  decay 
functions  for  ceiling  over  Northern  Europe  are,  W( lower)  =  625  kilometers  and 
W( upper)  =  1708  kilometers  for  January,  and  W( lower)  =  646  kilometers  and 
W( upper)  =  1764  kilometers  for  July. 

3. 2. 5. 3  Wavelengths  for  Visibility.  Table  3-8  summarizes  the  results  of  the 
spatial  correlation  study  for  visibility  over  Northern  Europe.  The  recommended 
sawtooth  wavelengths  that  will  create  representative  spatial  correlation  decay 
functions  for  visibility  in  Northern  Europe  are,  W( lower)  =  485  kilometers  and 
w(upper)  =  1327  kilometers  for  January,  and  W(lower)  =  767  kilometers  and 
W( upper)  =  2094  kilometers  for  July. 

3.2.6  Constants  for  the  Serial  Correlation  Exponential  Decay  Function 

3. 2. 6.1  General  Form  of  the  Exponential  Decay  Function.  The  extent  to  which 
meteorological  observations  valid  at  time  t  +  At  are  related  to  observations  at 
time  t  is  approximately  a  first  order  Markov  process.  In  general,  serial  corre¬ 
lation  of  ceiling  and  visibility  decreases  as  the  time  step  At  between  observa- 
tir ’s  increases.  Often  the  data  support  an  exponential  decay  of  p(t)  such  «-*?  the 
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model  presented  in  paragraph • 2 . 2 .  A  more  general  form  of  the  serial  correlation 
exponential  decay  model  is: 

p(t)  =  CONST  At  (3-20) 

Where  At  is  the  time  in  hours  between  observations  and  CONST  is  an  empirical  con¬ 
stant  derived  from  observed  data.  CONST  must  range  from  0  to  1.  In  this  model 
as  At  approaches  0,  p(t)  approaches  1,  and  the  two  observations  agree  perfectly. 
As  At  increases  (say  to  48  hours),  then  p(t)  approaches  0,  and  the  two  observa¬ 
tions  approach  independence. 

3. 2. 6. 2  Fitting  Observed  Serial  Correlation ^ata  to  the  Exponential  Decay  Model. 
To  estimate  a  representative  serial  correlation  decay  function  for  Europe,  a 
study  of  12  stations  was  conducted.  The  procedure  used  is  siumaarized  as: 

Twelve  locations  were  chosen  from  available  weather  stations  across  Europe. 
These  sample  stations  included  locations  in  England,  France,  Sweden,  Denmark, 
West  Germany,  East  Germany,  Poland,  and  the  Soviet  Union.  An  effort  was  made  to 
select  stations  with  various  geographic  characteristics.  The  prime  factor  in  the 
selection  process  however,  was  the  fact  that  each  station's  data  tape  had  at 
least  nine  years  of  surface  weather  observations  with  very  few  missing 
observations . 

The  Ceiling/Visibility  Simulation  Model  produces  a  temporally  and  spatially 
correlated  time  series  of  ENDs  of  ceiling  and  visibility  for  selected  locations. 
These  ENDs  are  then  transformed  into  raw  values  of  ceiling  and  visibility.  The 
entire  process  involves  assuming  that  the  individually  normalized  variables  are 
multivariate  normally  distributed.  To  remain  consistent  with  this  basic  assump¬ 
tion  of  multivariate  normal  distribution  the  serial  correlation  of  the  ENDs  of 
ceiling  and  visibility  was  studied  rather  than  the  correlation  of  the  raw  ceiling 
and  visibility  data. 

DATSAV  data  tapes  are  arranged  by  station.  For  example,  one  tape  contains 
all  surface  weather  data  for  Dusseldorf,  FRG  for  the  period  of  record  1  Jan  73  to 
31  Dec  82.  The  processing  of  each  tape  proceeded  as  follows.  Step  1  involved  an 
initial  pass  through  the  data  to  determine  the  cumulative  distribution  functions 
of  ceiling  and  visibility  for  eight  different  three-hour  time  periods  in  January 
and  July. 

These  CDFs  were  then  regressed  on  the  Weibull  modeling  functions  described 
earlier  in  this  technical  note  (Step  2).  Data  tape  was  then  rewound  and  Step  3 
involved  converting  each  raw  value  of  ceiling  and  visibility  to  ENDs  using  the 
modeling  coefficients  determined  in  Step  2.  Serial  correlation  coefficients  at 
3-,  6-,  9-,  12-,  15- ,  18- ,  21- ,  and  24-hour  time  lags  were  calculated  from  these 
ENDs  using  the  Pearson  product  moment  method  described  in  Appendix  A  (Step  4). 
In  this  manner  empirical  serial  correlation  decay  functions  were  determined  for 


ceiling  and  visibility,  in  January  and  July,  for  each  of  the  12  selected 
locations . 


Recalling  equation  (3-20)  the  goal  then,  is  to  obtain  a  value  for  CONST  when 
p(t)  and  At  are  known  from  these  observed  data.  This  is  accomplished  by  a  simple 
least  squares  fit  of  the  empirical  data.  Figure  3-7  is  a  comparison  of  the  expo¬ 
nential  decay  model  and  serial  correlation  coefficients  calculated  from  January 
Dusseldorf  ceiling  data.  The  modeled  function  is  for  a  decay  constant  of  0.926. 
Representative  decay  constants  were  determined  by  taking  the  average  serial  cor¬ 
relation  coefficients  for  all  12  locations  at  eaoh  time  lag,  and  then  fitting 
this  average  correlation  decay  function  to  the  model  (see  table  3-9). 
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Figure  3-7.  Plot  of  the  Exponential  Decay  Function 
vs.  Serial  Correlation  Coefficients  Calculated  from 
Observed  January  Ceiling  Data  for  Dusseldorf,  FRG. 

The  exponential  decay  curve  is  for  a  constant  of  0.926. 


Table  3-9.  Average  Serial  Correlation  Coefficients  for  Ceiling 
and  Visibility  for  Each  Time  Lag. 


TIME-LAG 


CEILING 

JANUARY 


VISIBILITY 
JANUARY 


0.706 
0.530 
0.443 
0.377 
0.308 
0.278 
0.252 
0.229 


0.644 

0.430 

0.446 

0.365 

0.315 

0.304 

0.287 

0.244 


0.686 

0.563 

0.495 

0.372 

0.368 

0.298 

0.265 

0.238 


0.769 

0.598 

0.498 

0.430 

0.387 

0.348 

0.320 

0.272 


OPTIMUM 
DECAY  CONSTANT 


0.921 


0.921 


0.932 


0.937 


RMS  DIFFERENCE 


0.06 


0.09 


0.06 


0.05 


3.2.7  initial  Conditions.  A  Harkov  process  is  a  probabilistic  model  for  a  con¬ 
tinuous  physical  system  from  which  a  sample  (xt,  x2,  ....  xR)  is  available.  The 
Markov  process  is  characterized  by  the  fact  that  the  state  of  the  system  at  tt 
depends  only  on  the  state  observed  at  tQ  and  not  the  path  by  which  the  state  at 
tQ  was  reached.  The  Ornstein-Uhlenbeck  equation  used  in  this  model  is  a  continu¬ 
ous  form  of  a  first-order  Markov  process.  Initial  conditions  for  the  ENDS  of 
ceiling  and  visibility  must  be  specified  to  the  model  before  the  simulation  can 
proceed. 

The  initial  conditions  for  WEASIM  must  be  read  into  the  labelled  COMMON  block 
/ENDFLD/ .  There  are  two  arrays  in  this  COMMON  block,  CQ  (for  the  ENDs  of  ceil¬ 
ing)  and  VQ  (for  the  ENDS  of  visibility).  These  arrays  are  dimensioned  to  accom¬ 
modate  ENDs  for  a  maximum  of  200  locations.  The  position  of  the  initial  ENDs 
within  the  arrays  should  be  in  the  same  order  that  the  Weibull  modeling  coeffi¬ 
cients  will  be  read  into  the  labelled  COMMON  block  /COEF/. 

3. 2. 7.1  initialization  Example.  To  illustrate  how  the  ENDs  might  be  entered 
into  the  model,  consider  this: 

A  user  wishes  to  specify  the  following  initial  conditions  of  Kitzingen  AAF,  FRG. 
It  is  January,  0300  GMT.  The  ceiling  is  400  feet  and  the  visibility  1.2  SM.  The 
Weibull  modeling  coefficients  at  Kitzingen  AAF  in  January  at  0300  GMT  are: 

a  (ceiling)  =  1032.28795 
8  (ceiling)  =  -0.90926268 

a  (visibility)  =  0.06526484 
P  (visibility)  =  1.50036855 

Equation  (3-1)  is  used  to  convert  the  threshold  visibility  (xT)  to  the  probabil¬ 
ity  that  X  is  less  than  xT- 

P  =  1  -  exp ( -a  xTP  ) 

=  1  -  exp(  -0,06526484  •  1<21'50036855  ) 

P  =  0.082 

The  END  corresponding  to  the  probability  (that  X  is  less  than  or  equal  to  x^)  of 
0.082  is  -1.39198589.  The  END  that  corresponds  to  a  given  probability  is  found 
through  rational  approximation.  USAFETAC  uses  an  algorithm  found  in  Abramowitz 
and  Stegun,  1964. 

The  same  process  is  followed  for  the  ceiling  values.  Equation  (3-12)  is  used  to 
convert  the  threshold  ceiling  (xT)  to  the  probability  that  X  is  less  than  xT< 


P  =  exp  l -a  xTP  ) 

=  exp( -1032. 28795  •  400 . "° ' 90926268  ) 

=  0.012 

The  END  corresponding  to  the  probability  (that  X  is  less  than  or  equal  to  Xy)  of 
0.012  is  -2.25757027. 

Thus,  -1.39198589  would  be  placed  in  the  appropriate  element  of  array  VQ  and 
-2.2575702  in  the  appropriate  element  of  array  CQ  and  the  model  would  be  ready  to 
proceed.  At  the  beginning  of  each  time  step  the  values  of  array  VQ  and  CQ  re¬ 
flect  current  conditions  (time  tQ).  The  0-U  process  acts  on  these  arrays  and  at 
the  end  of  each  time  step  arrays  VQ  and  CQ  are  updated  and  represent  the  condi¬ 
tions  at  time  tt . 

3. 2. 7. 2  Initialization  Options.  USAFETAC  has  provided  an  initialization  subrou¬ 
tine  to  HQ  USAF/SA  called  INTEND.  This  routine  gives  the  user  five  options  for 
specifying  initial  ceiling  and  visibility  conditions  at  each  location.  These 
options  are: 


Option  1  -  The  END  arrays  for  ceiling  and  visibility  at 
all  locations  initialized  to  zero  (th  0.50  probability 
level ) . 

Option  2  -  The  END  arrays  for  ceiling  and  visibility  at 
all  locations  are  initialized  to  some  specified 
probability  level. 

Option  3  -  The  END  arrays  for  ceiling  and  visibility  are 
initialized  by  a  random  process  in  a  manner  that  maintains 
the  spatial  correlation  of  each  variable  separately  as 
well  as  the  cross-variable  correlation. 

Option  4  -  Each  element  of  the  END  arrays  is  initialized 
to  a  value  corresponding  to  a  specified  probability  level. 

Option  5  -  Each  element  of  the  END  arrays  is  initialized 
to  a  value  corresponding  to  a  specified  threshold  ceiling 
and  visibility. 


Additional  options  may  be  added  later.  Initialization  can  be  accomplished  off 
line  or  as  part  of  the  using  simulation.  Input  to  subroutine  INTEND  is  handled 
through  the  calling  argument  and  consists  of  the  desired  option  number.  If 
options  4  or  5  are  requested  then  an  input  disk  or  tape  file  will  be  required. 
The  output  from  INTEND  is  handled  through  the  COMMON  blocks  /ENDFLD/  and  /OBS/. 
COMMON  area  /ENDFLD/  will  contain  the  ENDS  for  ceiling  and  visibility  at  each 
location  COMMON  block  /OBS/  will  contain  the  raw  variables  for  each  location. 
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Data  that  are  output  from  WEASIM  are  found  in  the  labelled  COMMON  block 
/OBS/.  There  are  two  arrays  in  this  COMMON  block,  CT  and  VT. 

3.3.1  Ceiling.  The  synthetic  ceiling  for  each  location  in  the  simulation  is 
found  in  array  CT.  This  array  is  dimensioned  to  accommodate  observations  for  a 
maximum  of  200  locations. 

The  position  of  each  synthetic  ceiling  is  determined  by  the  order  the  Weibull 
modeling  coefficients  for  each  location  were  read  into  the  labelled  COIMON  block 
/COEF/.  The  unit  of  measurement  for  the  ceiling  values  is  feet,  but  may  be 
converted  to  meters  (m)  by  applying  the  correction  factor  of  0.3048  m/ft. 

CT  (m)  =  CT(ft)  •  0.3048  m/ft  (3-21) 

The  values  in  the  array  CT  may  be  changed,  since  only  the  EMDs  of  ceiling  are 
needed  by  the  O-U  process  for  the  next  time  step. 

3.3.2  Visibility.  The  synthetic  visibility  for  each  location  in  the  simulation 
is  found  in  the  array  VT.  This  array  is  dimensioned  in  the  same  manner  as  CT; 
i.e.,  it  can  accommodate  one  observation  for  each  location  in  the  input  Weibull 
coefficient  file.  A  maximum  of  200.  The  position  of  each  synthetic  visibility 
value  is  determined  by  the  order  the  coefficients  for  each  location  were  read 
into  COMMON  block  /COEF/.  The  unit  of  measurement  for  the  visibility  values  is 
statute  miles,  but  may  be  converted  to  meters  by  applying  the  correction  factor 
of  1609.34  m/SM. 

VT(m)  =  VT(SM)  •  1609.34  m/SM  (3-22) 

The  values  in  the  array  VT  may  be  changed  since  only  the  ENDS  of  visibility  are 
needed  by  the  O-U  process  for  the  next  time  step. 

3.3.3  Masking.  The  values  of  ceiling  and  visibility  returned  by  WEASIM  within 
COMMON  block  /OBS/  are  in  continuous  form.  For  most  applications  this  form  is 
sufficient. 

In  nature,  ceiling  and  visibility  are  continuous  functions.  To  facilitate 
the  taking  and  transmitting  of  observations  by  weather  networks,  codes  have  been 
developed.  These  codes  do  not  allow  for  ceiling  and  visibility  to  be  reported  in 
continuous  form.  Network  observers  reduce  the  continuous  functions,  ceiling  and 
visibility,  to  discretized  values  based  on  the  applicable  code  they  are  follow¬ 
ing.  When  the  distributions  of  ceiling  and  visibility  are  fit  to  the  Weibull 
modeling  functions,  they  are  restored  to  their  original  continuous  form. 


A  user  of  WEASIM  has  the  option  of  specifying  which  for*  the  output  ceiling 
and  visibility  will  have  (continuous  vs.  discrete).  This  is  accomplished  through 
the  variable  NMASK  in  the  calling  sequence.  If  the  control  parameter  NMASK  is 
set  to  zero  then  a  reporting  mask  in  accordance  with  the  Federal  Meteorological 
Handbook  Humber  1  is  applied.  if  NMASK  is  set  to  any  other  value  no  mask  is 
applied  and  the  output  observations  will  be  in  continuous  form  such  as  11328. 3654 
feet  for  ceiling,  or  3.267329  SM  for  visibility. 

A  potential  user  of  WEASIM  should  consider  what  functions  the  synthetic 
observations  are  trying  to  simulate.  If  the  synthetic  observations  are  being 
used  for  straight  scoring  then  no  mask  should  be  applied.  However,  if  the 
purpose  of  the  synthetic  observations  is  to  simulate  possible  observations  from  a 
network  of  observers  then  the  mask  should  be  applied. 


Chapter  4 


MODEL  VERIFICATION  TESTS 


4.1  General 

In  chapter  2,  the  requirements  stipulated  for  the  Ceiling/Visibility  Simula¬ 
tion  Model  were  summarized.  This  chapter,  discusses  how  well  the  model  meets 
these  requirements.  The  verification  tests  that  tre  employed  in  this  chapter 
were  designed  to  test  the  results  at  two  levels.  The  verification  tests  were 
performed  on  the  output  of  the  model  in  both  END  and  raw  value  form.  These 
values  are  found  in  the  labelled  COMMON  blocks  /ENDFLD/  (the  ENDS)  and  /OBS/  (the 
raw  values  of  ceiling  and  visibility  after  the  ENDs  are  subjected  to  the  inverse 
transnormalization  process). 

The  first  group  of  tests  was  designed  to  verify  whether  the  model  preserves 
spatial,  temporal,  and  cross-variable  correlation  in  END  space.  The  mathematics 
of  the  O-U  process  guarantee  that  the  input  correlations  will  be  preserved,  as 
long  as  the  sawtooth  wave  submodel  is  indeed  producing  fields  of  random  numbers 
with  a  mean  of  zero  and  a  standard  deviation  of  one.  In  paragraph  2.3.1,  enti¬ 
tled  Stochastic  Assumptions ,  the  mean  and  standard  deviation  of  the  random  number 
fields  produced  by  the  sawtooth  wave  submodel  were  tested.  The  results  of  that 
test  verified  that  m  =  0  and  o  -  1  were  valid  assumptions.  Since  the  O-U  process 
guarantees  that  the  desired  correlations  will  be  preserved,  the  results  of  the 
tests  in  paragraph  4.2  mainly  act  to  verify  the  correctness  of  the  programming  of 
the  model. 

The  second  group  of  tests  was  designed  to  illustrate  the  model's  ability  to 
reproduce  the  unconditional  probability  distributions  and  the  various  joint  prob¬ 
ability  distributions  of  ceiling  and  visibility,  after  the  generated  ENDs  have 
been  transformed  to  raw  values  of  those  variables.  It  must  be  mentioned  here 
that  the  reproduction  of  the  joint  probabilities  of  raw  values  of  ceiling  and/or 
visibility  is  not  necessarily  guaranteed  by  the  mathematics  of  the  model.  There 
are  no  assurances  that  the  joint  probability  distribution  of  two  variables,  which 
are  individually  transnormalized,  is  multivariate  normal. 

4.2  Correlation  Tests 

The  computations  within  these  paragraphs  were  performed  on  the  ENDs  produced 
by  the  model.  These  ENDs  are  produced  by  the  O-U  process  in  conjunction  with  the 
sawtooth  wave  submodel.  The  correlation  structure  of  the  ENDs  are  a  function  of 
the  various  input  correlation  parameters  described  in  chapter  3.  The  results  are 
independent  of  the  inverse  transnormalization  process  that  produces  the  actual 
values  of  synthetic  ceiling  and  visibility. 
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4.2.1  Serial  and  Cross-variable  Correlation.  Lengthy  runs  of  WEASIM  were  made 
to  test  whether  the  serial  and  cross-variable  correlation  behavior  predicted  by 
equations  (2-11)  and  (2-30)  were  shown.  Results  for  runs  of  10,000  and  100,000 
simulated  hourly  observations  are  shown  in  table  4-1.  The  correlation  coeffici¬ 
ents  were  calculated  from  the  ENDS  of  ceiling  and  visibility  using  the  Pearson 
product  moment  formula  presented  in  Appendix  A.  Bracketing  values  in  parentheses 
represent  the  95-percent  confidence  limits  for  each  input  correlation  coeffici¬ 
ent,  determined  according  to  the  procedures  in  the  following  paragraphs. 

Correlation  coefficients  (r)  that  are  calculated  from  sample  data  (whether 
historical  data  or  data  generated  by  a  simulation  model  such  as  WEASIM)  are  sub¬ 
ject  to  sampling  variability.  The  distribution  of  the  sample  correlation  coeffi¬ 
cients  is  not  normal  but  approaches  normality  as  the  sample  size  increases.  The 
approach  of  the  distribution  of  r  to  normality  depends  not  only  on  sample  size 
but  on  the  value  of  the  population  correlation  p.  If  samples  are  drawn  from  a 
population  for  which  p  =  0,  the  distribution  is  approximately  normal,  approaching 
normality  rather  slowly  as  the  sample  size  increases.  In  this  case,  Student's 
t-Distribution  or  the  normal  distribution  is  used  in  testing  inferences  about  p. 
If  samples  are  drawn  from  a  population  for  which* p  t  0,  the  distribution  of  r  is 
very  skewed.  When  p  is  greater  than  zero,  the  skewness  tends  toward  the  left, 
with  high  values  of  r  being  relatively  more  probable  than  lower  values.  The 
skewness  is  reversed  for  p  less  than  zero.  This  complicated  dependency  of  the 
sampling  distribution  of  r  on  the  value  of  p  makes  it  impossible  to  employ  the 
t-test  or  normal  distribution  directly.  To  permit  inferences  about  p  /  0,  R.  A. 
Fisher  developed  for  a  bivariate  normal  population  the  Z-transformation  given  by, 

Z  =  0.5  In  (  i-f-§  1  (4-D 

For  sample  correlation  coefficients  r  computed  from  independent  draws  from  a 
bivariate  normal  population  whose  correlation  is  p,  the  statistic  Z  is  approxi¬ 
mately  normally  distributed,  with  a  mean  given  by, 

mz  =  °-5  in  i  H-f  1  (4“2) 

and  a  standard  deviation  given  by, 

o2  =  VV(N-3)  (4-3) 

where  N  is  the  sample  size.  The  goodness  of  these  approximations  increases  with 
smaller  absolute  values  of  p  and  with  larger  sample  sizes  N. 


If  the  population  correlation  is  p  and  one  samples  from  it  repeatedly,  95 
percent  of  the  sample  correlation  coefficients  drawn  from  the  population  will 
fall  between  the  so  called  "95-percent  confidence  limits"  of  p.  Thus,  from  a 
single  value  of  r  that  happens  to  lie  within  those  limits,  one  can  infer  with 
only  a  5-percent  risk  of  error,  that  the  population  correlation  is  p.  More  pre¬ 
cisely,  with  only  a  5-percent  risk,  r  is  not  significantly  different  from  the 
stated  p . 

ENDs  generated  by  WEASIM  have  the  bivariate  normal  distribution,  but  if  all 
the  simulated  ceiling  and  visibility  observations  produced  by  WEASIM  are  included 
in  the  sample  used  for  calculating  correlations,  then  data  have  not  been  sampled 
independently,  and  a  correction  must  be  made  accordingly.  If  a  correction  is 
made  for  serial  dependency,  it  is  possible  to  make  hypotheses  about  the  correla¬ 
tions  p  in  the  model. 

Hypothesize  that  a  population  correlation  of  WEASIM  is  p  ?  0  and  calculate 
the  95-percent  confidence  limits  about  p  based  on  sampling  the  WEASIM  process  N 
consecutive  times.  To  correct  for  serial  dependency  in  the  generated  time  series 
of  WEASIM  observations,  equation  (2-90)  is  used  to  compute  the  effective  sample 
size  N' .  Fisher's  Z-statistic  is  calculated  using  equation  (4-2)  and  the  stand¬ 
ard  deviation  is  computed  by  equation  (4-4). 

0Z  =  n/arrz)  (4-4) 

The  95-percent  confidence  limits  in  Z  space  are  given  by, 

Z,  =  Z  +  1 . 96o  (4-5) 

u  z 

Z,  =  Z  -  1 . 96o  (4-6) 

1  z 

An  inverse  Fisher  Z- transformation  is  used  to  convert  the  confidence  limits  in 
the  Z  domain  to  confidence  limits  in  the  p  domain: 

exp(  2Z  )  -  1 

.  _  u _  (4-7) 

pu  exp(  2ZU  )  +  1 

exp(  2Z,  )  -  1 

P1  =  exp(  2ZX  )  +  1  (4-8) 

Fisher’s  Z-transform  can  also  be  expressed  in  the  form  of  the  hyperbolic  tangent. 
Multiplying  the  numerator  and  denominator  of  equation  (4-7)  by  e  gives, 


p 


~  tanh  (2) 


(4-9) 


Z  =  tanh-1  (p)  (4-10) 

Results  in  table  4-1  show  that  the  sample  correlation  coefficients  produced 
by  WEASIM  fall  within  the  95-percent  confidence  limits  of  the  input  (hypothe¬ 
sized)  correlations.  The  model  appears  to  preserve  the  serial  correlation  pcc  of 
the  ceiling,  the  serial  correlation  pw  of  the  visibility,  and  the  cross¬ 
correlation  pcv  of  ceiling  and  visibility.  In  addition,  the  cross-lag  correla¬ 
tion  Pvtct+1  does  appear  to  reduce  to  the  automatic  correlation  P^P^- 

4.2.2  Spatial  Correlation.  Test  runs  were  made  to  verify  that  WEASIM  preserves 
spatial  correlation  decay  functions  similar  to  Gringorten' g  Model-B.  In  each  of 
these  spatial  correlation  tests  five  locations  were  chosen  at  random.  The  model 
was  allowed  to  generate  5,000  ENDs  for  ceiling  and  visibility  at  each  of  the 
locations.  Sawtooth  wavelengths  were  chosen  for  each  run  to  represent  various 
Gringorten  scale  distances  (equations  2-45  and  2-46).  The  time  increment  between 
observations  was  chosen  to  be  24  hours  to  minimize  the  amount  of  serial  correla¬ 
tion,  thus  maximizing  the  effective  sample  size  of  each  test. 

The  actual  geographic  location  of  each  of  the  randomly  selected  sites  was  not 
of  major  concern  for  these  tests.  Rather,  the  distance  between  each  of  the  sites 
was  the  most  important  factor  in  determining  the  expected  (Gringorten' s  Model-B) 
and  returned  (WEASIM)  correlation  coefficients.  Table  4-2s  and  4-5  list  the 
coordinates  of  the  five  sites  chosen  for  two  of  these  spatial  correlation  tests. 
These  tables  also  show  the  great  circle  distances  in  kilometers  between  each  of 
the  location  pairs  as  computed  by  the  model.  Spatial  correlation  coefficients 
for  the  ENDs  of  ceiling  and  visibility  were  calculated  for  each  of  the  station 
pairs  using  the  Pearson  product  moment  method  described  in  Appendix  A.  These 
calculated  coefficients  were  then  compared  to  correlation  coefficients  determined 
from  Gringorten' s  Model-B.  The  results  of  spatial  correlation  test  #1  are  shown 
in  table  4-3  (Ceiling)  and  table  4-4  (Visibility),  and  the  results  of  spatial 
correlation  test  #2  are  shown  in  tables  4-6  (ceiling)  and  4-7  (Visibility).  The 
left-most  values  in  these  tables  are  those  spatial  correlation  coefficients 
returned  from  WEASIM,  the  right-most  values  are  those  correlation  coefficients 
computed  by  Gringorten 's  Model-B  for  the  given  scale  distance  (which  the  model 
attempts  to  emulate).  The  bracketing  values  in  parentheses  represent  the  95- 
percent  confidence  limits  for  each  correlation  coefficient  predicted  by 
Gringorten's  Model-B  (the  hypothesized  distribution),  using  the  procedure 
described  in  detail  in  paragraph  4.2.1.  For  these  spatial  correlation  tests  the 
sample  size  N  was  5,000  observations  and  the  input  serial  correlation  was  0.258. 
The  effective  sample  size  N'  was  computed  to  be  2,949  independent  observations 
(using  equation  2-90). 
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Table  4-2.  Great  Circle  Distance  Between  Location  Pairs  for 
Spatial  Correlation  Test  1. 


GREAT 

CIRCLE  DISTANCE  (KM) 

BETWEEN 

SELECTED 

LOCATION  PAIRS 

LAT 

LON 

LOC  IND 

1 

2 

3 

4  i 

54.30 

-1.50 

1 

0. 

52.70 

-0.60 

2 

188. 

0. 

52.60 

-0.50 

3 

200. 

13. 

0. 

52.80 

0.80 

4 

226. 

95. 

90. 

0. 

52.80 

1.40 

5 

254. 

135. 

130. 

40.  0 

Negative  Values  for  Longitude  are  West  Longitude  Coordinates 

The  results  in  tables  4-3,  4-4,  4-6,  and  4-7  show  that  the  model  indeed  pre¬ 
serves  spatial  correlation  decay  functions  similar  to  Gringorten's  Model-B.  In 
fact,  for  the  wavelengths  equivalent  to  scale  distances  less  than  4.5  kilometers, 
the  correlation  coefficients  returned  by  the  model  fall  within  the  95-percent 
confidence  limits  computed  from  the  correlation  coefficients  from  Gringorten's 
Model-B.  For  these  shorter  wavelengths/scale  distances,  one  could  not  reject  the 
hypothesis  that  the  two  distributions  are  the  same.  If  the  goal  of  the  model  is 
to  preserve  spatial  correlation  decay  functions  similar  to  Gringorten's  Model-B 
for  scale  distances  greater  than  4.5  kilometers,  then  different  constants  should 
be  used  to  compute  the  passband  for  the  sawtooth  wavelengths  (see  paragraph 
2.2.7). 

The  reader  should  note  that  the  spatial  correlation  structure  of  the  fields 
of  ends  produced  by  WEASIM  conforms  to  some  desireable  preconceptions.  That  is, 
for  all  the  tests,  the  correlation  coefficients  decrease  exponentially  with 
distance,  at  least  for  short  distances,  and  drop  to  zero  in  larger  but  finite 
distances.  Based  on  these  results,  the  model  has  been  shown  to  preserve  spatial 
correlation  between  station  pairs,  in  a  manner  similar  to  Gringorten's  Model-B. 

4.3  Distribution  Tests 

The  computations  within  these  paragraphs  were  performed  on  the  distributions 
of  raw  values  of  synthetic  ceiling  and  visibility  produced  by  the  model.  These 
raw  values  of  synthetic  ceiling  and  visibility  represent  the  final  output  of 
WEASIM.  The  values  are  found  within  the  labelled  COMMON  block  /OBS/  after  each 
replication  of  the  model.  The  synthetic  observations  are  the  end  product  of  the 
sawtooth  wave  submodel,  the  0-U  process,  and  the  inverse  transnormalization 
process,  that  together  comprise  the  Ceiling/Visibility  Simulation  Model. 


4-3.  Results  of  Spatial  Correlation  Test  1  (Ceiling). 
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Table  4-5.  Great  Circle  Distance  Between  Location  Pairs  for 


Spatial 

Correlation 

Test  2. 

GREAT 

CIRCLE  DISTANCE  (KM) 

BETWEEN 

SELECTED 

LOCATION  PAIRS 

LAT 

LON 

LOC  IND 

1 

2 

3 

4  5 

46.80 

-68.10 

1 

0. 

51.60 

6.10 

2 

5182. 

0. 

50.80 

6.70 

3 

5258. 

98. 

0. 

50.00 

6.70 

4 

5297. 

183. 

89. 

0. 

49.90 

6.60 

5 

5296. 

192. 

100. 

13.  0 

Negative  Values  for  Longitude  are  West  Longitude  Coordinates 


4.3.1  Unconditional  Probability  Distributions.  Each  pair  of  coefficients  for 
the  Weibull  modeling  functions  describe  a  particular  cumulative  distribution 
function  for  ceiling  or  visibility.  When  specific  coefficient  sets  are  used  in 
the  inverse  transnormalization  process  of  the  Ceiling/Visibility  Simulation 
Model,  then  a  long  run  of  WEASIM  should  return  the  same  CDFs  as  defined  by  those 
coefficients,  within  the  accuracy  limits  imposed  by  sampling  error. 


To  test  this  hypothesis,  initially,  four  test  runs  of  WEASIM  were  made.  In 
each  sample  run,  5,000  observations  were  generated  for  each  of  five  locations. 
No  recording  mask  was  used  and  each  observation  fell  at  the  same  time  of  day 
(because  a  24-hour  time  step  was  used).  The  24-hour  time  step  was  chosen  to 
minimize  the  amount  of  serial  dependence  in  the  sample  population  and  insure  that 
the  same  coefficient  sets  were  used  for  the  entire  test.  During  each  of  the  test 
runs,  the  synthetic  values  of  ceiling  and  visibility,  generated  by  the  model, 
were  sorted  by  category  and  location  (see  table  4-8).  This  frequency  of  occur¬ 
rence  data  was  accumulated  for  later  statistical  calculations. 


At  the  conclusion  of  each  simulation  test  run,  the  relative  frequency  distri¬ 
butions  of  ceiling  and  visibility,  at  each  location,  were  computed  by  dividing 
the  raw  counts  in  each  of  the  seven  intervals  by  the  total  sample  size  N.  The 
cumulative  frequency  distribution  that  the  synthetic  ceiling  or  visibility  was 
less  than  the  selected  thresholds  was  then  easily  computed  from  the  relative 
frequency  distribution.  The  cumulative  frequency  that  the  ceiling  or  visibility 
is  less  than  the  upper  threshold  of  interval  J  is  given  by  the  equation. 


CDF(J)  =  1  RELF. 

i=l  1 


(4-11) 


Where  RELF^  is  the  relative  frequency  of  occurrence  in  each  interval  (see  table 
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Table  4-8.  Intervals  for  Ceiling  and  Visibility  Values  Used  in 
the  Unconditional  Probability  Distribution  Tests. 


INTERVAL 

CEILING 

im 

VISIBILITY  (sm) 

<  CIG 

< 

<  VSBY  < 

1 

0  - 

200 

0  -  0.5 

2 

200  - 

500 

0.5  -  1.0 

3 

500  - 

1,000 

1.0  -  2.0 

4 

1,000  - 

2,000 

2.0  -  3.0 

5 

2,000  - 

3,000 

3.0  -  4.0 

6 

3,000  - 

10,000 

4.0  -  6.0 

7 

10,000  - 

+00 

6.0  -  +® 

Table  4-9.  Sample  Relative  Frequency  to  Cumulative  Frequency 
Conversion.  The  relative  frequency  distribution  is  that  of 
synthetic  visibility  generated  for  Spangdahlem  AB,  FRG,  January, 


1200  GMT. 

INTERVAL 

VSBY 

JM1 

RELATIVE 

FREQUENCY 

WITHIN 

INTERVAL 

CUMULATIVE 
FREQUENCY 
LESS  THAN 
UPPER  THRESHOLD 

1 

Lower 

Threshold 

0 

Upper 

Threshold 

0.5 

0.083 

0.083 

2 

0.5 

1.0 

0.067 

0.150 

3 

1.0 

2.0 

0.120 

0.270 

4 

2.0 

3.0 

0.099 

0.369 

5 

3.0 

4.0 

0.083 

0.452 

6 

4.0 

6.0 

0.132 

0.584 

7 

6.0 

+  00 

0.416 

1.000 

After  each  of  the  four  test  runs  was  completed,  the  frequency  'distributions 
returned  by  the  model,  for  each  location,  were  then  compared  to  the  Weibull 
distributions  for  each  location.  The  results  of  these  comparisons  are  shown  in 
tables  4-10  through  4-13.  In  these  tables,  the  Weibull  distributions,  and  those 
distributions  returned  from  WEASIM,  are  listed  in  their  cumulative  distribution 
form,  i.e.,  the  cumulative  frequency  that  the  ceiling  or  visibility  was  less  than 
the  specifed  thresholds.  The  maximum  difference  listed  for  each  location  is  the 
maximum  absolute  difference  between  the  Weibull  and  the  generated  CDFs.  The 
chi-square  closeness  of  fit  test  was  performed  on  the  modeled  and  generated 
distributions,  for  each  location,  in  their  relative  frequency  form.  Equation 
(2-91)  was  used  for  the  chi-square  test  and  the  sample  size  was  corrected  for 
serial  dependence  using  equation  (2-90).  For  these  unconditional  probability 
tests,  the  sample  size  N  was  5,000  observations  and  the  input  serial  correlation 
was  0.258.  The  effective  sample  size  was  computed  to  be  2,949  independent 
observations . 
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-12.  Marginal  Distribution  Test  3.  Probabilities  are  the  percent  of  time  that  the  ceiling  and  visi 
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Table  4-13.  Marginal  Distribution  Test  4.  Probabilities  are  the  percent  of  time  that  the  ceiling  and  visi 
bility  is  less  than  or  equal  to  the  listed  thresholds. 

MONTH:  January  TIME:  1200  GMT  N  =  5,000  Observations 

N'  (Corrected  for  Serial  Dependence)  =  2949  Observations 
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The  hypothesis  to  be  tested  was  whether  the  unconditional  probability  distri¬ 
butions  of  ceiling  and  visibility  resulting  from  the  simulation  were  the  same  as 
those  probability  distributions  described  by  the  weibull  models,  within  the 
limits  of  sampling  error.  Figure  4-1  illustrates  the  percentage  points  of  the 
chi-square  distribution  for  six  degrees  of  freedom.  The  critical  chi-square 
value  for  the  0.05  level  of  significance  and  six  degrees  of  freedom  (7  possible 
intervals  -  1)  is  12.59.  One  would  expect  the  calculated  chi-square  values  to 
exceed  12.59  only  5  percent  of  the  time  if  the  simulated  and  Weibull  distribu¬ 
tions  are  the  same,  and  the  differences  were  from  sampling  error  alone.  Of  the 
40  cases  tested  initially,  the  calculated  chi-square  values  exceeded  12.59  three 
times  (7.5  percent).  To  strengthen  the  statistical  test,  six  additional  sample 
runs  were  made.  The  additional  tests  involved  the  same  model  run  parameters,  but 
were  for  different  locations  and  times.  The  additional  locati<~is  were  airfields 
in  West  Germany,  East  Germany,  Denmark,  and  the  Soviet  Union.  The  results  of  the 
additional  tests  are  not  listed  in  tabular  form.  Of  the  total  100  cases  involved 
in  the  unconditional  probability  of  ceiling  and  visibility  test,  the  calculated 
chi-square  values  exceeded  12.59  five  times  (5.0  percent).  Based  on  these 
results,  you  cannot  reject  the  null  hypothesis.  The  test  indicates  that  WEAS1M 
does  preserve  the  unconditional  probabilities  of  ceiling  and  visibility  as 
described  by  the  Weibull  functions,  within  the  limits  of  accuracy  imposed  by 
sampling  error. 
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Figure  4-1.  Percentage  Points  of  the  \2  Distribution 
for  6  Degrees  of  Freedom. 
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4.3.2  Ceiling/Visibility  Curve  Fit  Results.  in  the  previous  paragraphs,  it  was 
shown  that  WEASIM  does  reproduce  the  unconditional  probability  distributions  of 
ceiling  and  visibility  as  described  by  the  Weibull  functions.  The  question 
remains,  "How  close  do  the  Weibull  functions  come  to  describing  the  distributions 
of  ceiling  and  visibility,  at  various  locations,  as  reported  by  ground 
observers?" 

It  must  be  re-emphasized  here  that  weather  observations  contain  biases  intro¬ 
duced  by  coding  and  reporting  procedures.  At  some  east  European  airfields,  when 
ceiling  and  visibility  values  exceed  certain  critical  thresholds,  the  weather 
observers  do  not  report  the  actual  values  for  those  variables.  Rather,  the  crit¬ 
ical  threshold  is  reported.  For  example,  a  critical  theshold  for  ceiling  might 
be  3,000  feet.  When  the  actual  ceiling  exceeds  3,000  the  weather  observer  con¬ 
tinues  to  report  3,000  feet  until  the  next  critical  threshold  is  exceeded  (see 
figure  4-2).  This  figure  gives  a  plot,  in  histogram  form,  of  the  cumulative 
frequency  that  the  visibility  was  less  than  selected  thresholds  at  Baranovichi, 
USSR,  during  January,  2300-0100  GMT.  This  empirical  distribution  of  visibility 
is  based  on  observations  from  a  9-year  P0R  from  January  1973  to  December  1981. 
NOTE:  This  station  never  reported  visibility  values  greater  than  or  equal  to 
4  SM  during  this  period.  This  is  represented  by  the  fact  that  the  cumulative 
frequency  that  the  visibility  was  less  than  4  SM  is  1.00.  There  are  no  reasons, 
from  a  meteorological/physical  point  of  view,  that  values  of  4  SM  or  greater 
would  never  be  observed.  It  is,  apparently,  an  operational  (unpublished)  prac¬ 
tice  at  this  airfield  to  stop  reporting  values  for  visibility  once  those  values 
exceed  4  SM  (6400  m). 

The  curve  plotted  in  figure  4-2  represents  the  modeled  function  obtained  from 
regressing  the  empirical  distribution  on  the  Weibull  curve,  using  the  technique 
described  in  paragraph  3. 2. 1.3,  this  technical  note.  The  Weibull  coefficients 
obtained  from  the  regression  were  a  =  0.20110328  and  p  =  1.44501913.  Although 
there  was  a  relatively  large  RMS  difference  between  the  two  fits  (0.075)  and  the 
maximum  difference  was  0.225  at  4  SM,  there  is  good  reason  to  believe  that  the 
modeled  distribution  (Weibull  curve)  describes  reality  more  closely  than  the 
observed  (empirical)  distribution. 

In  addition  to  operational  practices,  the  mere  act  of  coding  meteorological 
data  causes  some  problems.  There  are  many  different  types  of  meteorological 
codes  currently  in  use  around  the  world.  Each  of  these  codes  was  designed  to 
facilitate  the  reporting  and  transmitting  of  weather  observations.  Intrinsic  in 
the  coding  of  weather  observations  is  the  act  of  reducing  the  continuous  varia¬ 
bles  ceiling  and  visibility,  to  discretized  values.  Some  of  the  codes  reduce  the 
values  for  ceiling  and  visibility  to  relatively  small  numbers  of  coded  values 
(see  figure  4-3).  The  coding  procedures  for  observations  at  Cottbus,  GDR,  and 
the  decoding  produres  at  AFGWC,  allow  for  only  11  different  values  of  ceiling  to 
be  reported.  The  resultant  CDF,  derived  from  the  observations  at  Cottbus,  is  not 
quite  as  smooth  a  distribution  as  a  CDF  derived  from  observations  at  a  weather 
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Figure  4-2.  Observed  and  Modeled  CDFs  for  Visibility 
at  Baranovichi,  USSR,  January,  2300-0100  GMT. 

station  whose  procedures  might  allow  for  many  more  coded  values  (figure  3-3  is  an 
example  of  a  distribution  derived  from  a  weather  station  whose  reported  ceiling 
heights  could  be  more  than  30  different  values).  Empirical  CDFs  derived  from  the 
more  limiting  codes  may  still  be  fit  to  the  modeling  functions  using  the  regres¬ 
sion  techniques  described  in  paragraph  3.2,  this  technical  note.  Although  the 
resultant  regression  lines  will  follow  the  general  shape  of  these  empirical  dis¬ 
tributions,  sizeable  differences  will  result  between  the  observed  and  modeled 
CDFs  at  several  threshold  values  because  these  values  are  never  reported. 

Strictly  speaking,  because  of  the  incongruous  nature  of  the  observing  prac¬ 
tices  from  country  to  country,  RMS  differences  between  the  observed  and  modeled 
CDFs  can  not  give  an  absolute  measure  of  the  closeness  of  fit.  However,  RMS  dif¬ 
ferences  can  still  give  a  potential  user  of  WEASIM  an  idea  of  the  magnitude  of 
the  "errors"  (departures  from  the  observed  distributions)  that  will  result  when 
using  coefficient  sets  from  specific  areas.  Table  4-14  summarizes  the  results  of 
over  16,000  regressions  that  comprised  the  European  ceiling/visibility  coeffi¬ 
cient  sets  that  USAFETAC  provided  to  HQ  USAF/SA.  This  table  lists  the  average 
RMS  differences  of  all  the  regressions  for  ceiling  and  visibility,  broken  down  by 
month  and  geographic  area.  Despite  the  biases,  bumps,  and  spikes  in  the  empirical 
distributions,  overall,  the  modeled  CDFs  come  lairly  close  to  the  observed  CDFs. 
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Figure  4-3.  Observed  and  Modeled  CDFs  for  Ceiling 
at  Cottbus,  GDR,  January,  1400-1600  GMT. 


The  results  in  Table  4-14  show  that  the  average  RMS  diffences  are  lower  for  the 
visibility  curve  fits  then  they  are  for  ceiling.  Additionally,  the  average  RMS 
differences  for  both  the  ceiling  and  visibility  curve  fits  are  smaller  in  July 
then  they  are  in  January.  Table  4-15  gives  a  slightly  different  summary  of  the 
RMS  differences  of  the  individual  regressions.  This  tables  lists  the  percentage 
of  the  total  number  of  cases  in  which  the  RMS  difference  exceeded  0.03,  0.05,  and 
0.07. 

The  results  in  Table  4-15  show  that  the  RMS  differences  for  visibility  are 
almost  always  quite  small.  In  the  "worst  weather"  month  of  January,  only  20.5 
percent  of  the  curve  fits  had  RMS  values  greater  than  0.03.  Most  of  these  cases 
were  in  the  Soviet  Union  and  the  Balkan  countries,  and  were  due  mainly  to  the 
operational  practice  discussed  in  the  last  few  paragraphs.  In  July,  only  6.5 
percent  of  the  regressions  for  visibility  had  RMS  values  greater  than  0.03. 

Table  4-15  illustrates  the  observed  distributions  of  ceiling  are  more  diffi¬ 
cult  to  describe  with  a  modeling  function,  than  the  observed  distributions  of 
visibility.  Observing  and  reporting  cloud  ceiling  is,  at  times,  far  more  sub¬ 
jective  than  reporting  visibility.  This  subjectivity,  coupled  with  the  problems 
that  go  with  encoding  the  data,  cause  the  observed  distributions  of  ceiling  to  be 
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less  smooth,  more  variable,  than  the  observed  distributions  of  visibility.  Thus 
the  higher  RMS  values  result.  The  largest  RMS  differences  are  for  the  regres 
sions  of  Scandanavian  and  Warsaw  Pact  ceiling  data. 


Table  4-14.  Average  RMS  Differences  of  the  Curve  Fits  that  Created  the  European 
Ceiling/Visibility  Coefficient  Data  Base. 


GEOGRAPHIC 

AREA 

JANUARY 

JULY 

#  OF 
FITS 

VISIBILITY 

CEILING 

#  OF 
FI  iS 

VISIBILITY 

CEILING 

Scandanavia 

1000 

0.018 

0.048 

856 

0  008 

0.032 

United  Kingdom 

1024 

0.016 

0.032 

1024 

0.0x2 

0.032 

Iceland/ 

Greenland 

32 

0.005 

0.025 

32 

0.005 

0.028 

Denmark 

144 

0.018 

0.042 

144 

0.009 

0.036 

Benelux/France/ 

Switzerland 

904 

0.020 

0.030 

668 

0.013 

0.023 

Spain/Azores 

56 

0.013 

0.026 

56 

0.005 

0.021 

East  Germany 

336 

0.028 

0.044 

336 

0.018 

0.028 

West  Germany 

1256 

0.020 

0.028 

1256 

0.011 

0.026 

Austria/ 

Czechoslovakia/ 

Poland 

776 

0.030 

0.048 

776 

0.014 

0.032 

Hungry/Romani a/ 
Bulgaria/Yugoslavia 

808 

0.044 

0.034 

624 

0.017 

0.022 

Italy/Greece 

872 

0.016 

0.038 

888 

0.015 

0.020 

Turkey/ 

Middle  East 

360 

0.012 

0.040 

360 

0.005 

0.020 

Soviet  Union 

800 

0.032 

0.052 

800 

0.014 

0.034 

United  States/ 

Canada 

120 

0.007 

0.013 

120 

0.004 

0.014 

OVERALL 

8488 

0.023 

0.038 

8160 

0.012 

0.027 

Overall,  considering  the  fact  that  the  observed  CDFs  for  both  ceiling  and 
visibility,  for  the  given  months  and  time  periods,  are  constructed  from  rela¬ 
tively  small  sample  sizes  (most  were  100-300  observations),  a  great  majority  of 
the  individual  modeled  curves  fall  well  within  the  credibility  limits  of  the 
observed  distributions. 


Table  4-15.  Percentage  of  the  Total  Number  of  Curve  Fits  that 
the  RMS  Difference  Exceeded  Selected  Values.  The  numbers  in 
parentheses  are  the  total  number  of  fits  for  that  month. 


MONTH/VARIABLE 

RMS  THRESHOLD  VALUES 

0.03 

0.05 

0.07 

January  (8488) 

Ceiling 

Visibility 

51.4 

20.5 

23.4 

5.9 

9.9 

3.1 

July  (8160) 

Ceiling 

visibility 

32.1 

6.5 

12.3 

2.0 

4.5 

0.9 

4.3.3  Joint  Probability  Distributions.  In  paragraph  2. 3. 2. 3  a  key  assumption 
made  by  this  model  was  discussed.  That  assumption  was  the  joint  distributions  of 
ceiling  and/or  visibility,  in  space  and  in  time,  represent  an  underlying  multi¬ 
variate  normal  distribution.  Verification  of  this  assumption  poses  an  unmanage¬ 
ably  large  problem.  For  example,  consider  the  following  hypothetical  problem.  A 
simulation  involving  five  locations  is  to  be  run.  The  number  of  joint  distri¬ 
butions  to  be  preserved/verified  in  this  problem  is: 


Joint  Distribution  # 


Ceiling  vs.  Visibility  at  each  location  5 
Ceiling  at  each  location  pair  10 
Visibility  at  each  location  pair  10 
CIG  at  time  t0  vs.  CIG  at  time  tj  at  each  location  5 
vsby  at  time  t0  vs.  VSBY  at  time  tt  at  each  location  5 


This  list  is  by  no  means  exhaustive,  but  the  magnitude  of  the  problem  is*evident. 
For  this  relatively  small  problem  at  least  35  joint  probability  tables  could  be 
generated  for  verification.  When  1000+  locations,  two  seasons,  eight  time  peri¬ 
ods,  etc.,  are  considered,  literally  thousands  of  joint  distributions  could  be 
generated  and  compared  to  the  observed  distributions  (if  the  observed  joint  dis¬ 
tributions  were  available  in  the  first  place).  Ideally,  a  stringent  statistical 
test,  like  the  chi-square  test,  should  be  applied  to  the  various  joint  distribu¬ 
tions.  However,  this  would  require  a  rather  man/computer  intensive  exercise  to 
recover  the  raw  counts,  needed  for  the  chi-square  test,  from  the  historical 
weather  data.  Rather  than  commit  the  resources  neccessary  for  a  stringent  sta¬ 
tistical  test,  it  was  decided  to  present  the  results  of  several  case  studies  in  a 
limited  number  of  tables.  From  these  case  studies,  a  potential  user  of  WEASIM 
might  gain  some  insight  as  to  where  the  model  is  most  or  least  valid,  and  the 
magnitude  of  the  errors  that  might  be  expected. 
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4. 3. 3.1  Joint  Probability  of  Ceiling  and  Visibility  at  Individual  Locations.  By 
adjusting  the  cross-correlation  in  either  equation  (2-24)  (for  the  single 

station  case)  or  2-38  (for  multiple  locations),  it  is  possible  to  adjust  the 
joint  probabilities  of  ceiling  and  visibility  produced  by  WEASIM.  The  model  then 
could  produce  ~  by  Monte  Carlo  simulation  --  synthetic  joint  probabilities  tuned 
to  match  actual  distributions  as  represented  by  the  interior  of  RUSSWOs  or  N- 
Summaries.  To  repeat  what  was  stated  in  chapter  2,  if  ENDs  of  ceiling  and  visi¬ 
bility  were  distributed  exactly  according  to  a  multivariate  normal  distribution, 
and  if  no  biases  were  introduced  by  the  methods  used  to  observe  and  record  the 
weather,  then  WEASIM  could  produce  synthetic  RUSSWOs  differing  from  "natural" 
RUSSWOs  by  no  more  than  sampling  error.  Even  if  a  nearly  perfect  fit  were  at¬ 
tained  between  a  synthetic  and  a  “natural"  RUSSWO.  one  would  merely  be  tempted  to 
ask,  "Have  you  fitted  nature  or  have  you  fitted  an  inadequate  perception  of 
nature?" 

Putting  aside  the  question  of  the  basic  advisability  of  "fitting"  a  synthetic 
RUSSWO  to  a  natural  RUSSWO,  it  is  remarkable  how  close  a  fit  can  be  obtained  just 
by  "tuning"  the  cross-variable  correlation  coefficient  p  .  Consider  the  follow¬ 
ing  case  from  an  earlier  USAFETAC  project  (see  table  4-16).  The  model  was 
allowed  to  generate  100,000  synthetic  observations  of  ceiling  and  visibility. 
The  time  step  between  observations  was  chosen  to  be  24  hours.  In  this  example, 
p  and  p  were  set  to  the  same  value  (p  =  p__,  =  0.257).  This  was  done  so  p’ 
would  simplify  to  pcv  according  to  equation  (2-30).  The  largest  difference 
between  the  synthetic  and  the  natural  RUSSWO  is  only  0.038. 


Table  4-16.  Comparison  of  the  Scott  AFB,  IL  RUSSWO  and  Joint 
Probability  of  Ceiling  and  Visibility  Produced  by  WEASIM. 


RUSSWO  EXTRACT,  CEILING  VS.  VISIBILITY  SECTION 
SCOTT  AFB,  IL,  FEB,  1200-1400  LST 

Table  Based  on  2,863  Observations 
POR  1938,  1940-1972 


Visibility  (sm) 


Ceilinq  (ft) 

6.0 

4.0 

3.0 

2.0 

1.0 

0.5 

0.0 

10,000 

0.506 

0.548 

0.556 

0 . 560 

0 . 660 

0.560 

0.560 

3,000 

0.601 

0.669 

0.686 

0.692 

0.695 

0.695 

0.695 

2,000 

0.656 

0.749 

0.771 

0.780 

0.780 

0.787 

0.787 

1,000 

0.689 

0.817 

0.857 

0.876 

0.892 

0.896 

0.896 

200 

0.695 

0.836 

0.888 

0.928 

0.980 

0.998 

1.000 

0 

0.695 

0.836 

0.888 

0.928 

0.980 

0.998 

1.000 

0 


Synthetic  RUSSWO  Produced  by  NEMXN 
Scott  AFB,  XL.  FSB  1200  LST 

Tint  Stop  At  *  24  hr  p  «  0.72  No  Recording  Meek 
Teble  Beeed  on  100,000  Synthetic  obeervetione 


Visibility  (ffl 


Cailina  (ft) 

6.0 

4.0 

3.0 

hi 

hi 

0.5 

hi 

10,000 

0.400 

0.530 

0.541 

0.546 

0.540 

0.540 

0.540 

3,000 

0.603 

0,605 

0.710 

0.725 

0.732 

0.733 

0.733 

2,000 

0.634 

0.734 

0.766 

0.787 

0.797 

0.799 

0.799 

1,000 

0.667 

0.793 

0.841 

0.874 

0.893 

0.896 

0.897 

200 

0.682 

0.828 

0.890 

0.940 

0.976 

0.986 

0.989 

0 

0.682 

0.829 

0.829 

0.944 

0.983 

0.995 

1.000 

The  Scott  AFB  ceiling/visibility  example  is  one  in  which  the  multivariate 
normal  distribution  assumption  is  quite  reasonable.  Many  more  examples,  in  which 
the  simulated  joint  probability  distributions  are  quite  similar  to  the  observed 
data,  could  be  presented  here.  Synthetic  RUSSWOs  were  generated  for  10  different 
stations  from  the  European  coefficient  set.  These  stations  were  chosen  at  random 
from  England,  West  Germany,  and  the  Soviet  Union.  In  each  of  the  individual 
tests,  a  value  for  pcv  was  calculated  from  the  observed  joint  probability  distri¬ 
bution  of  ceiling  and  visibility  and  then  input  to  the  model.  In  nine  of  the  ten 
test  cases,  the  maximum  difference  between  the  synthetic  joint  probability  dis¬ 
tributions  and  the  observed  distributions  (RUSSWOs)  was  less  than  0.06.  The 
values  for  the  simulated  distributions  came  well  within  the  possible  error  inter¬ 
vals  imposed  from  sampling  theory.  This  is  due  to  the  fact  that  in  most  of  the 
cases,  the  observed  tables  were  based  on  relatively  small  sample  sizes.  Table 
4-17  shows  the  results  of  the  only  test  case  in  which  the  maximum  difference 
between  the  simulated  and  observed  distributions  was  greater  than  0.06.  This 
table  shows  the  results  from  a  simulation  run  involving  Zugspitze,  FRG. 

This  station  is  located  in  the  Alps,  at  an  elevation  of  2960  meters.  The 
joint  distribution  of  ceiling  and  visibility  is  decidedly  not  multivariate 
normal.  Depending  on  the  large  scale  meteorological  (synoptic)  situation,  the 
mountain  slope  is  either  shrouded  in  cloud,  with  both  ceiling  and  visibility 
approaching  zero,  or  the  flow  is  such  that,  the  mountain  is  clear,  with  visibil¬ 
ity  virtually  unlimited.  The  two  variables  are  highly  correlated  but  they  are 
not  multivariate  normally  distributed.  In  in  this  case,  one  of  the  basic  mathe¬ 
matical  assumptions  made  by  WEASIM  is  violated  and  some  major  differences  between 
the  observed  and  simulated  distributions  result.  In  the  Zugspitze  example,  the 
maximum  difference  between  the  two  distributions  is  0.165. 


Table  4-17.  Comparison  of  the  Zugspitze,  FRG ,  ,N-S>aiuuary  and  Joint 
Probability  of  Ceiling  and  visibility  Produced  by  VEASIM. 


N-SUMMARY  EXTRACT,  CEILING  VS.  VISIBILITY  SECTION 
ZUGSPITZF.,  FRG ,  JAN,  ALL  HOURS 

Table  Based  on  3,584  Observation': 

POR  49-54,  60-62,  64-71 


Visibility  (sni) 


Ceiling 

JL?U 

6.0 

;■  ■;  ; 

< 

;  ) 

8,000 

0. 53V 

0.  i't  .6'i'j 

0  ,  v.  . 

j  ;,c, 

3,300 

0.580 

0  98 1  O  1  8.' 

<•  .  V  •'>  •  ) 

0 . 688 

2,000 

O.bOO 

<  i .  <)  0  2  0.<o. 

0.-04  .■  ’ 

i-o. 

1,000 

0  .  b09 

0.611  0  ■  • !  2 

i- .  i  ‘7  .  ; 

0  6)6 

300 

0.617 

0 . 62')  1 

ci';  i  •  .  ,,’f 

c  (:>/■•> 

0 

0 . 624 

0.6.80  ‘  .  . 

f  .  i  i  .  :  •  '  » 

\ :  1 1 

Synthetic  N-Summary 
ZUGSPITZE,  FRG, 

Produced  by  WEA.UM 
JAN,  Af.L  HOOKS 

Time  Step  A  t  = 
Table  Based  on 

23  hr  p  -  .958  No  Recording  Mask 

100 , 000cSynthet ic  Obscrvati one 

’■  i  t  -• 

Ceiling 

JJ11 

6.0 

2  .  t>  i  2‘j* 

o  ■  oV. 6  v  il3  0.125 

u  .  0 

8,000 

0.379 

0.380  0.280 

0.380  ■' .  880  0  . 

0  3  8  0 

3,300 

0.424 

0.426  '’.4  2  6 

t* .  i •  •  .  *.  *.  •.  i  : 

'  ' 

2,000 

0.450 

0.451  0.452 

0.463  0.462  0.453 

0.466 

1,000 

0.486 

0.487  0.488 

0.48..  o  . !■  .  401 

0  49- 

300 

0.538 

0.642  6.644 

0  .  sc. v  ! 

r,  .  w  \ 

0 

0 . 698 

0 .  r>06  0.61,2 

o  c1  ;•  (  : 

!  .  0-00 

The  results  obtained  from  the  te  t  in  li.i-.  ->«;!  i  '  !  -i-.l  .  that  WKAiSlM  Call 
be  adjusted  or  "tuned”  to  fit  obscived  joint  probability  distributions  of  ceiling 
end  visibility.  In  moat  eases,  the  synthetic  joint  probability  distributions  of 
ceiling  end  viaibility  come  very  close  to  the  observed  distributions,  Howevet, 
the  model  feila  to  reproduce  the  joint  probability  distributions  at  certain  high* 
elevation  or  coeatel  stations  that  have  deeidely  nonmultivariate  normal  diatribe* 
tions  of  celling  end  visibility. 


w 


4.1.U  Opsrstionsl  flying  cstsgoris*.  Tht  tin  dsseribsd  in  ths  pfsvlous  part- 
graph*  vara  not  vary  atringant,  in  that  tha  aynthatle  obaarvationa  vara  ganaratad 
tor  only  ona  location  in  aaeh  toat  run.  Tho  input  eroaa-variahia  eorralation 
eoattieiont  was  adjuatod  apaeitieally  tor  aaeh  individual  location.  Thia  nathod 
ia  not  eonaiatant  vith  tha  way  tha  nodal  will  bo  uaad  operationally.  Penally, 
WCAS1N  will  be  reguired  to  ganarata  aynthatle  obaarvationa  for  aultiple  loca¬ 
tion*,  uaing  a  single  Mrapraaantativau  eroaa-variable  eorralation  eoattieiont. 
In  an  attort  to  nore  realistically  duplicate  how  tha  nodal  night  ba  uaad  opera¬ 
tionally.  a  nore  rigorous  teat  of  WCASIN's  ability  to  preaerve  tha  joint  proba¬ 
bility  of  ceiling  and  visibility  was  devised. 

Earlier  in  this  technical  note,  paragraph  3.2.4,  it  was  discussed  the  AWSTL 
archives  statistical  tabulations  of  historical  weather  data,  at  various  airfields 
worldwide.  These  tabulations  are  available  for  use  in  analyses  and  studies.  One 
form  of  these  tabulations  are  called  N-Summaries .  Part  of  the  N-Summary  is  a 
breakout  of  the  percent  of  time  that  the  selected  airfield's  weather  met  the 
criteria  for  certain  operational  flying  categories.  These  flying  categories  are: 

Flying  Category  A  —  Visibility  >2.5  SM  AND 

Ceiling  >  1000  ft 

Flying  Category  B  —  Visibility  >  1.25  SM  AND 

Ceiling  >  650  ft 

But  not  meeting  CAT  A  criteria 

Flying  Category  C  —  Visibility  <  1.25  SM  OR 

Ceiling  <  650  ft 

The  decision  to  test  the  model  against  these  criteria  was  made  because  the 
observed  data  already  existed  and  could  easily  be  extracted. 

N-Summaries  for  11  locations  were  selected  from  the  AWSTL.  The  11  airfields 
selected  for  this  series  of  tests  were  in  West  Germany,  East  Germany,  Denmark, 
Finland,  and  the  Soviet  Union.  Duplicate  test  runs  were  conducted  for  both  Janu¬ 
ary  and  July.  The  correlation  run  parameters,  for  each  month,  were  chosen  from 
the  various  tables  in  chapter  3.  The  specific  values  for  the  correlation  param¬ 
eter  used  in  the  operational  flying  category  tests  were  those  recommended  for 
northern  Europe  overall,  in  each  of  the  tables,  and  are  listed  in  table  4-18. 

WEASIM  was  allowed  to  generate  8,000  synthetic  observations  at  each  location. 
The  time  increment  between  observations  was  23  hours.  Many  of  the  tests  pre¬ 
sented  in  this  chapter,  this  value  for  the  time  step  between  observations  was 
chosen  to  minimize  the  amount  of  serial  dependence  between  observations.  Choos¬ 
ing  it  of  23  hours  also  allowed  observations  from  all  hours  to  be  generated,  thus 
testing  all  eight  sets  of  ceiling/visibility  modeling  coefficients  for  each  loca¬ 
tion.  After  each  replication  of  the  model,  the  proper  flying  category  for  each 
station's  synthetic  observation  was  determined  and  the  information  was  accumu¬ 
lated  for  later  calculations.  At  the  end  of  the  simulation,  the  percent  of  time 
that  each  location  was  in  each  of  the  specific  flying  categories  was  calculated. 


Table  4-18.  Correlation  Parameters  Used  in  the  Operational  Plying 
Category  Tests . 


JANUARY 

Pcv  0-520 

Expontential 
Decay  Constants 

Ceiling  0.921 

visibility  0.932 

Sawtooth 

Wavelengths 

(km) 

Ceiling 


Minimum  625 

Maximum  1708 

Visibility 

Minimum  485 

Maximum  1327 


#  OF  SAWTOOTH  WAVES  9 


JULY 

0.267 


0.921 

0.937 


646 

1764 


767 

2094 

9 


These  values  were  then  used  to  compare  against  those  values  found  in  the  i 

.  i 

N-Summaries.  I 

i 

The  joint  probability  distribution  of  the  synthetic  values  of  ceiling  and 
visibility  were  assumed  to  be  the  hypothetical  distribution.  The  hypothesis 
tested  was,  “Could  the  distributions  listed  in  the  N-Summaries  have  come  from  the 
hypothetical  distributions?"  The  results  of  the  tests  are  listed  in  tables  4-19 
(January)  and  4-20  (July).  The  N-Summaries  were  made  up  from  observations  of 
varying  periods  of  record.  Some  of  the  PORs  were  as  short  as  5  years,  the  long¬ 
est  was  25  years.  These  observations  are  taken  at  3-hour  intervals,  so  there  is  i 

serial  dependence  involved  in  the  total  number  of  observations.  The  sample  sizes  | 

that  went  to  make  up  the  observed  distributions  must  be  adjusted  for  that  serial  \ 

dependence.  To  make  this  adjustment,  an  average  serial  correlation  coefficient 
of  0.805  for  a  3-hour  separation  was  assumed.  Equation  (2-90)  was  then  used  to  ' 

i 

adjust  the  sample  size  for  serial  dependence  for  each  location.  These  are  the  > 

values  for  N'  listed  in  the  tables.  ■ 

i 

The  chi-square  test  was  performed  using  the  relative  frequencies  in  each  of 
the  flying  categories  for  the  generated  observations  and  those  extracted  from  the 
N-Summaries.  The  critical  chi-square  value  for  a  significance  level  of  0.05  and  ; 

two  degrees  of  freedom  (3  categories  -  1)  is  5.99.  Table  4-19  shows  that  in  ■ 

January  the  computed  chi-square  value  is  less  than  the  critical  chi-square  value  | 

in  eight  of  the  11  cases.  Table  4-20  shows  that  for  the  July  case,  the  computed  j 
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chi-square  value  is  less  than  the  critical  chi-square  value  in  nine  of  11  cases. 
If  the  Zuqspitze  is  disregarded  (it  was  shown  in  the  preceding  section  that  the 
multivariate  normal  assumption  does  not  hold  there),  then  the  computed  chi-square 
value  was  less  than  the  critical  chi-square  value  in  17  of  the  20  test  cases. 
The  chi-square  value  for  an  individual  location  could  be  minimized  by  adjusting 
i>cv .  However,  this  exercise  was  designed  to  test  the  overall  power  of  the  model. 

The  chi-square  values  for  July  were,  on  the  average,  lower  than  the  values 
for  January.  This  result  also  goes  along  with  that  illustrated  in  tables  4-14 
and  4-15,  i.e.,  that  the  model  comes  closer  in  July  to  describing  the  observed 
distributions  of  ceiling  and/or  visibility  than  in  January.  NOTE:  By  comparing 
tables  4-19  and  4-20,  the  model  realistically  depicts  the  seasonal  variation  of 
flying  conditions  at  these  locations. 

The  results  obtained  from  the  flying  category  test  indicate  that  there  is  no 
clear  cut  reason  to  reject  the  hypothesis  that  the  observed  distributions  listed 
in  the  N-Summaries  could  have  come  from  the  hypothetical  distributions  produced 
from  WEASIM,  within  the  limits  imposed  by  sampling  error.  WEASIM  appears  to 
adequately  reproduce  the  percent  of  time  that  individual  airfields  experience 
selected  flying  conditions. 

4. 3. 3. 3  Joint  Distributions  of  Ceiling  and  of  Visibility  in  Space.  The  final 
test  of  WEASIM  to  be  discussed  is  how  well  the  model  preserves  the  joint  proba¬ 
bility  distributions  of  ceiling  and  of  visibility  in  space.  In  paragraph  4.2.2, 
it  was  shown  that  the  model  preserves  spatial  correlation  in  END  space.  The 
question  remains,  "How  well  does  the  model  preserve  the  joint  probability  of 
occurrence  of  ceiling  or  visibility  once  the  ENDS  at  various  locations  are  trans¬ 
formed  to  raw  values  of  those  variables?"  For  this  purpose,  some  of  the  joint 
probability  tables  that  were  generated  by  USAFETAC,  OL-A,  and  discussed  in  para¬ 
graph  3.2.5,  were  used.  The  goal  of  the  test  then,  was  to  see  how  well  WEASIM 
could  reproduce  the  observed  joint  probability  distributions  of  ceiling  and  of 
visibility,  for  selected  station  pairs.  Two  sets  of  tests  were  conducted.  For 
the  first  set  of  tests,  five  locations  were  selected  from  southern  West  Germany 
and  Austria.  For  the  second  set  of  tests,  five  locations  were  selected  from 
central  West  Germany.  The  stations  were  specifically  selected  because  the 
observed  joint  distributions  were  readily  available  and  the  data  was  easily 
extracted.  Each  set  of  tests  consisted  of  a  run  for  both  January  and  July.  The 
model  was  allowed  to  generate  5,000  synthetic  observations  for  each  location 
using  the  correlation  parameters  listed  in  table  4-18.  During  each  simulation 
run,  the  frequency  of  occurrence  that  ceiling  and  visibility  exceeded  certain 
thresholds,  at  selected  station  pairs  simultaneously,  was  acculmulated.  At  the 
termination  of  each  run,  the  joint  probability  of  occurrence  was  calculated  by 
dividing  the  frequency  counts  by  the  total  sample  size. 

In  spatial  distribution  test  #1,  Kaufbueren,  FRG  was  used  as  a  base  station. 
The  joint  probability  of  occurrence  that  the  ceiling  and  visibility  exceeded 


specific  thresholds  at  Kaufbueren  and  each  of  the  other  locations  was  calculated. 
In  spatial  distribution  test  #2,  Frankfurt,  FRG,  was  chosen  as  the  base  station. 
The  results  in  tables  4-21  through  4-24  show  that  the  model  gives  a  fairly  faith¬ 
ful  representation  of  the  joint  probability  distributions  as  was  observed  and 
reported  by  these  stations.  NOTE:  The  simulated  distributions  in  this  test  also 
show  the  seasonal  variations  that  are  evident  in  the  observed  joint  probability 
distributions.  The  maximum  difference  between  the  simulated  and  observed  distri¬ 
butions  were  slightly  lower  in  July  then  in  January.  In  nine  of  the  16  cases, 
the  maximum  difference  is  less  than  0.03.  In  five  of  the  seven  cases  that 
exceeded  0.03,  the  maximum  difference  occurs  in  "good  weather"  categories  (e.g., 
VSBY  >  4  SM  or  CIG  >  2,000  ft). 

4-4  Summary  and  Conclusions 

A  multiple-station,  two-variable  model  has  been  constructed  and  tested.  The 
model  has  been  found  to  preserve  the  serial  correlation  pcc  of  ceiling  over  time, 
the  serial  correlation  pw  of  visibility  and  the  cross-correlation  pCT  of  ceiling 
and  visibility.  The  cross-lag  correlation  pvtct+1  in  this  model  reduces  to  the 
automatic  correlation  P^P^*  Furthermore,  the  spatial  correlation  decay  func¬ 
tions  of  ceiling  and  visibility,  as  modeled  by  Gringorten's  Model-B,  is 
reproduced . 


The  model  appears  to  preserve  the  marginal  distributions  of  ceiling  and 
visibility  (as  described  by  the  Weibull  functions),  and  gives  a  faithful  repre¬ 
sentation  (in  most  cases)  of  the  joint  probabilities  between  ceiling  and  visibil¬ 
ity,  as  shown  in  the  interior  of  RUSSWOs  and  N-Summaries.  USAFETAC  currently  has 
the  capability  for  fitting  the  Weibull  curves  to  visibility  and  ceiling  data  from 
reporting  weather  stations  all  over  the  globe. 
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Appendix  A 


CALCULATION  OF  CORRELATION  COEFFICIENTS 


Throughout  this  technical  note,  correlation  of  two  variables  (whether  in 
space  or  tine)  was  discussed.  This  appendix  was  added  to  define  what  is  meant  by 
this  term  and  to  discuss,  in  detail,  the  methods  used  to  calculate  the  various 
correlation  coefficients  that  were  referenced. 

Correlation  is  a  measure  of  association,  not  of  causation.  Loosely,  it  can 
be  said  that  correlation,  and  in  particular  the  various  correlation  "coeffici¬ 
ents,"  are  measures  of  relatedness  among  variables.  Statistical  relatedness  does 
not  necessarily  imply  physical  causation.  Correlation,  then,  is  a  way  of  expres¬ 
sing  the  association  between  variables,  an  association  that  need  not  be  causal  in 
nature . 

In  the  last  paragraph  the  ideas  about  correlation  were  expressed  qualitative¬ 
ly.  In  statistics,  however,  the  "correlation  coefficient"  or  just  "correlation" 
is  a  quantitative  concept,  capable  of  being  expressed  in  numbers  that  describe 
the  degree  of  relatedness  or  association  between  variables.  A  scale  for  measur¬ 
ing  correlation  is  desirable.  When  there  is  no  relationship  between  variables, 
this  statistical  measure  ought  to  approach  zero.  The  measure  of  correlation 
should  approach  unity  when  the  relationship  between  variables  is  very  high. 

A  general  definition  of  correlation  can  be  set  down:  two  measurable  charac¬ 
teristics,  A  and  B,  are  said  to  be  correlated  when,  with  different  values  x  of  A, 
the  same  value  y  of  B  is  not  equally  likely  to  be  associated.  In  other  words, 
certain  values  of  B  are  more  likely  to  occur  with  the  value  x  than  others.  If 
they  were  not,  correlation  would  be  absent.  Correlation  would  be  perfect  if  for 
every  value  of  A  the  same  value  of  B  occurred. 

The  correlation  coefficient  measures  the  relative  importance  of  the  relation¬ 
ship  between  two  variables  in  a  nondimens ional  sense,  i.e.,  it  does  not  depend 
upon  any  arbitrary  choice  of  units  by  which  the  original  variables  were  measured. 
The  concept  of  a  theoretical  population  correlation  coefficient  can  be  developed 
along  these  lines: 

Let  X  and  Y  be  two  random  variables  on  a  sample  space  S  such  that 

X(S)  *  (x,,  xt,  •••»  AjjJ  (A— 1) 

Y($)  ■  {y»,  y« . yn)  (A-2) 

with  joint  probability  density  function  fXY<*  ,y).  Then  the  covariance  of  X  and 
Y,  denoted  by  Cov(X,Y),  is  defined  by 


(A-3) 


Cov(X,Y)  s  E[(X  -  mx)<Y  -  My)] 

■  E(XY)  -  E(X)E(Y)  (A-4) 

*  E(XY)  -  MxMy  (A-5) 

where  mx  is  the  mean  of  X  and  My  is  the  mean  of  Y.  Assuming  a  linear  relation¬ 
ship  between  the  random  variables  X  and  Y  results  in  the  following  expression  for 
the  theoretical  linear  correlation  coefficient  between  X  and  Y: 


XY 


Cov(X.Y) 
°X  °Y 


(Linear)  (A-6) 


where  ox  is  the  standard  deviation  of  X  and  oy  the  standard  deviation  of  Y. 

Equation  (A-6)  is  an  expression  for  the  theoretical  linear  correlation  coef¬ 
ficient.  For  problems  requiring  sampling  of  actual  data,  it  is  not  the  theoret¬ 
ical  correlation  coefficient  p  but  rather  the  sample  correlation  coefficient  r 
that  is  of  interest. 


An  expression  for  the  sample  correlation  coefficient  r^  can  be  developed  by 
considering  a  set  of  (X,Y)  data  pairs,  where  Y  is  considered  a  function  of  X: 

Y  =  f(X)  (General)  (A-7) 

Here  Y  are  actual  values  of  the  dependent  variable,  and  Y  are  the  Y-values 
predicted  by  the  function  f.  If  f  is  a  linear  function,  then  equation  (A-7) 
particularizes  to 


Y  =  a0  ♦  atX  (Linear)  (A-8) 

The  linear  function  f  in  one  independent  variable  may  not  perfectly  describe 
the  behavior  of  the  Y-data.  There  may,  for  example,  be  independent  variables 
other  than  X  that  are  important  in  predicting  Y,  or  there  may  be  a  nonlinear 
dependence  involved.  The  scatter  of  actual  Y-values  about  the  prediction  Y  given 
by  equation  (A-8)  can  be  described  in  terms  of  the  standard  error  of  the  estimate 
of  Y  on  X,  given  by 


Syx  =  7l[(Y  -  Y)2]  /  N  (General)  (A-9) 

which  applies  to  both  linear  and  nonlinear  associations  of  the  form  shown  in 
equation  (A-7).  If  the  linear  association  of  equation  (A-8)  is  used,  then  equa¬ 
tion  (A-9)  becomes 


(Linear) 


(A-10) 


_  *  _  XY2  -  aoXY  -  a<2XY 

XY - IT - - 

a  Measure  of  the  standard  error  of  the  linear  estinate  of  Y  on  X. 

The  total  variation  of  Y  is  defined  as  Z(Y  -  ?)*,  the  sum  of  the  squares  of 
the  deviations  of  Y  from  its  aean  Y.  That  total  variation  of  Y  can  be  parti¬ 
tioned  into  an  unexplained  variance  2(Y  -  Y)2  and  an  explained  variance  2(Y  -  Y)2: 

2(Y  -  Y)2  *  X(Y  -  ?)2  ♦  2(?  -  Y)2  (General)  (A-ll) 

Total  Unexplained  Explained 

Variation  Variation  Variation 

where  Y  is  an  estinated  value  of  Y  based  on  the  value  of  X  Md  the  functional 
relationship  expressed  by  equations  (A-7 )  and  (A- 8 ) .  The  first  term  on  the  riqht 
of  equation  (A-ll)  is  called  the  "unexplained"  variation  because  the  deviations 
behave  in  an  apparently  random  or  unpredictable  manner.  The  second  tern  on  the 
right  of  equation  (A-ll)  is  called  the  "explained"  variation  because  the  devia¬ 
tions  involved  have  a  definite  pattern. 

The  sample  correlation  coefficient  r^  between  the  variables  X  and  Y  is  given 
by 


rXY  *  * 


/ 


Explained  Variation 
Total  Variation 


I  X(Y  -  Y)2 
^  2(Y  -  ?)* 


(General)  (A-l 2) 


which  varies  between  -1  (perfect  negative  correlation)  and  +1  (perfect  positive 
correlation)  and  is  nondimensional  and  independent  of  the  origin.  The  i  sign  is 
used  to  introduce  the  sign  of  the  correlation.  Equation  (A-12)  is  a  general 
expression  for  the  correlation  coefficient  and  can  be  used  for  linear  or  non¬ 
linear  correlation.  Using  equation  (A-9)  in  (A-ll),  and  making  use  of  the  fact 
that  the  standard  deviation  of  Y  is 


8y  =  JlHY  -  ?)*]  /  N 


(A-13) 


permits  equation  (A-12)  to  be  written  as 


(General) 


(A- 14) 


rXY  =  i/(*Y2  '  sYX2)  /  SY* 

Like  equation  (A-12 ) ,  equation  (A-14)  is  general  and  can  be  used  for  nonlinear 
and  linear  correlation.  If  f  is  computed  from  a  nonlinear  function  (equation 
A-7 ) ,  and  the  i  signs  are  oaiitted,  then  equations  (A-12 )  and  (A-14)  describe 
nonlinear  correlation. 


rXY  = 


I(X  -  X) (Y  -  Y) 

VTDT7P  VT7Y=7P 


where  the  covariance  of  x  and  Y  is 


(Linear)  (A-15) 


and  the  standard  deviations  are  given  by  equation  (A-13)  for  8y  and  by  its 
analogue  for  s^. 

If  equations  (A-16),  (A-13),  and  the  analogue  mentioned  immediately  above  are 
used  in  equation  (A-15),  the  result  is 


rXY  =  ~Ip-—  (Linear)  (A-17) 

which  parallels  equation  (A-6)  for  the  linear  population  correlation  p^y. 

The  interpretation  attached  to  the  sample  correlation  coefficient  r^  depends 
on  the  functional  form  introduced  in  equation  (A-8)  for  the  association  between 
the  two  variables  X  and  Y.  If  a  linear  association  is  assumed,  then  r^  is  cal¬ 
culated  from  equation  (A-15)  and  measures  the  extent  to  which  a  linear  dependence 
of  Y  on  X  explains  the  variation  of  Y  data,  and  r^2  becomes  the  fraction  of  the 
total  variation  of  Y  explained  by  a  linear  dependence  on  X.  If  a  nonlinear  asso¬ 
ciation  is  used  for  equation  (A-7),  then  r^2  —  which  can  then  no  longer  be  cal¬ 
culated  from  equation  (A-15) — is  the  fraction  of  the  total  variation  of  Y 
explained  by  a  particular  nonlinear  association  with  X.  Just  because  there  is  no 
linear  correlation  between  the  variables  X  and  Y  does  not  mean  there  is  no  cor¬ 
relation  at  all.  There  may  in  fact  be  a  high  nonlinear  correlation  between  the 
variables. 
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The  stochastic  process  model  developed  in  this  technical  note  employs  linear 
correlation  methods  exclusively.  Therefore,  throughout  this  technical  note,  all 
references  to  correlation  refer  to  linear  correlation. 

USAFETAC  uses  two  methods  most  frequently  when  calculating  linear  correlation 
coefficients:  (1)  the  Pearson  product  moment  (PPM)  formula,  and  (2)  the  tetra- 
choric  method. 

The  Pearson  product-moment  (PPM)  formula  for  calculating  linear  correlation 
coefficients  is  basically  equation  (A-15).  That  form  of  the  equation  is  computa¬ 
tionally  inefficient  because  it  requires  the  means  to  be  known  in  advance  (from 
an  earlier  pass  through  the  data).  Algebraic  manipulation  of  equation  (A-15) 
produces  a  computationally  efficient  PPM  formula: 

r  =  N2XY  -  (IX) (IY)  (A.18| 

**  VTHSXMSX)1]  /ITEY^TTYF] 

The  correlation  coefficient  rxy  is  symmetrical  in.X  and  Y.  This  symmetry 
indicates  that  the  coefficient  of  correlation  does  not  distinguish  between  the 
dependent  and  independent  variable.  It  permits  conclusions  about  the  existence 
of  a  linear  relationship  .between  two  variables  but  not  about  which  "depends"  on 
the  other.  Although  it  is  relatively  easy  to  implement  on  a  computer,  Pearsons' 
method  has  the  inherent  disadvantage  of  requiring  the  raw  data  to  be  available 
for  the  computations.  The  tetrachoric  method  offers  the  advantage  of  being  able 
to  use  data  that  has  already  been  categorized. 

Any  two  variables  can  be  reduced  to  a  two-by-two  table: 


X 

Above  Below 


A 

B 

V 

y  yT 

B  1 

L 

W 


where  A,  B,  C,  and  D  are  the  number  of  cases  above  or  below  the  critical,  or 
threshold  values,  (xT  and  y^)  of  the  respective  variables.  An  approximation  to 
the  tetrachoric  correlation  coefficient  (rt)  can  be  obtained  by  equation  (A-19): 


A 

B 

C 

D 

(A-19) 


n  vro  -  JVC 
=  sin  I  - 


2  JKC  +  JVC 


] 


This  equation  is  accurate  where  (A  +  B)/N  and  (A  +  C)/N  are  close  to  0.5 
(where  N  is  the  total  number  of  cases),  but  may  contain  sizeable  error  for  values 
near  one  or  zero.  Since  there  is  no  simple  exact  formula  for  calculating  rfc(  an 
algorithm  based  on  the  false  position  method  (Acton,  1970)  is  used  by  USAFETAC. 
The  coefficient  is  evaluated  at  two  initial  guess  values,  and  linear  interpola¬ 
tion  is  used  to  find  a  better  estimate.  The  quantity  rt  behaves  in  a  manner 
similar  to  an  ordinary  linear  correlation  coefficient,  but  the  exact  numerical 
value  is  not  completely  comparable.  The  value  of  rfc  varies  from  -1  to  +1,  giving 
zero  for  no  relation,  but  the  sign  depends  in  a  rather  arbitrary  manner  on  the 
arrangement  of  the  contingency  table. 


-W 


Appendix  B 


AFGWC 


AWSTL 


/COEF/ 


COMMON 


/CORREL/ 


DATSAV 


/ENDFLD/ 


FORTRAN 


LIST  OF  ABBREVIATIONS  AND  ACRONVMS 


Air  Base 
above 

Air  Force  Base 

Air  Force  Global  Weather  Central 
American  National  Standard 
Austria 

automated  weather  network 

Air  Weather  Service  Technical  Library 

below 

Belgium,  Netherlands,  Luxembourg 
cumulative  distribution  function 
ceiling 

a  labelled  COMMON  block  within  WEASIM  used  to  store  the 
modeling  coefficients  for  ceiling  and  visibility 

area  of  computer  storage  shared  between  the  main  program  and 
its  subprograms 

a  labelled  COMMON  block  within  WEASIM  used  to  store  correla¬ 
tion  parameters 

ANS  FORTRAN  library  function  to  compute  cosines 
cosine,  a  trigonometric  function 
central  processing  unit 

computer  tape  format  in  which  decoded  weather  observations 
are  stored  at  USAFETAC 

degree  of  arc 

Detachment,  a  type  of  Air  Force  unit 

equivalent  normal  deviate,  the  standard  normal  variable 

a  labelled  COMMON  block  within  WEASIM  used  store  the  ENDS  of 
ceiling  and  visibility 

Federal  Republic  of  Germany 

formula  translation,  a  computer  language  suitable  for  scien¬ 
tific  problem  solving 


great  circle  distance 
German  Democratic  Republic 


k 


7T*gTCTBWVrswr*3WS*3; 


GMT 

/GRIDCM/ 

HQ  USAF/SA 
HQ  USAF/SAGW 

IBM 

IL 

I/O 

km 

lat 

lng 

LST 

m 

ME 

NC 

ME 

NM 

N-Summary 

/OBS/ 

OL 

OS/VS 2  MVS 

O-U 

POR 

PPM 

RAD 

RAF 

RMS 

RUSSWO 

SAS 

SD 

SIN 


Greenich  mean  time 

a  labelled  COMMON  block  within  WEASIM  used  to  store  the 
values  for  the  trigonometric  equivalents  of  the  latitude  and 
longitude  of  the  locations  being  modeled 

Headquarters  United  States  Air  Force  studies  and  Analyses 

the  office  for  the  Special  Assistant  for  Weather  Support 
within  HQ  USAF/SA 

International  Business  Machines 

Illinois 

input-output  function  for  a  computer  program 

kilometer 

latitude 

longitude 

local  standard  time 

meter 

Maine 

North  Carolina 
Nebraska 
nautical  mile 

a  statistical  tabulation  of  historical  weather  data  for  a 

single  location,  produced  by  USAFETAC - a  precursor  of  the 

RUSSWO 

a  labelled  COMMON  block  within  WEASIM  used  to  store  the  out¬ 
put  values  of  synthetic  ceiling  and  visibility 

Operating  Location,  a  type  of  Air  Force  Unit 

IBM  multiple  virtual  storage  operating  system 

Ornstein-uhlenbeck 

period  of  record 

Pearson  product  moment  formula  for  the  calculation  of  the 
correlation  coefficient  from  sample  data 

radians 

Royal  Air  Force 
root  mean  square 

Revised  Uniform  Summary  of  Surface  Weather  Observations,  a 
statistical  tabulation  of  historical  weather  data  for  a 
single  location,  produced  by  USAFETAC 

statistical  analysis  system 

scale  distance,  a  modeling  parameter 

ANS  FORTRAN  library  function  to  compute  sines 


B-2 


>3“ 


sin 

SM 

Std  Dev 
UK 

USAFETAC 

USAFETAC/DNM 

USAFETAC/DNY 
USAFETAC,  OL-A 

USSR 

VSBY 

WEAINT 

WEASIM 

WHO 


sine,  a  trigonometric  function 
statute  mile 
standard  deviation 
United  Kingdom 

United  States  Air  Force  Environmental  Technical  Applications 
Center 

the  office  within  USAFETAC  that  provides  probability  and 
statistics  consultation  services 

the  Environmental  Simulation  Section  within  USAFETAC 

an  operating  location  of  USAFETAC  in  Asheville,  NC,—  the 
unit  is  co-located  with  the  National  Climatic  Center  and  acts 
as  data  base  manager  for  all  USAFETAC  cl  l...atic  data  sets 

Union  of  Soviet  Socialist  Republics 

visibility 

Fortran  subroutine  provided  to  HQ  USAF/SA  that  acts  as  an 
initialization  routine  for  WEASIM 

Fortran  subroutine  that  represents  the  Ceiling/Visibility 
Simulation  Model 

world  Meteorological  Organization 
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